
GRCh37 vs. GRCh38

What's the Difference?

GRCh37 and GRCh38 are successive major releases of the human reference genome from the Genome Reference Consortium, with GRCh38 being the more recent. GRCh38 closes many gaps, corrects misassembled regions, represents repetitive regions such as centromeres more accurately, and adds alternate sequences for regions that vary substantially between haplotypes. It also draws on data from the 1000 Genomes Project and other sources to correct individual bases in the reference, and it is the assembly on which current gene annotations are built. Overall, GRCh38 is considered a more refined and reliable reference than GRCh37.

Comparison

Attribute | GRCh37 | GRCh38
Release Date | February 2009 | December 2013
Number of Genes | 20,000 | 19,000
Number of SNPs | ~28 million | ~80 million
Assembly Length | 3.1 billion base pairs | 3.2 billion base pairs

Further Detail

Introduction

The Genome Reference Consortium (GRC) periodically releases updated versions of the human reference genome to incorporate new discoveries and improve accuracy. Two of the most widely used versions are GRCh37 and GRCh38. In this article, we will compare the attributes of these two versions to understand the improvements and changes that have been made.

Assembly Quality

GRCh37, distributed by UCSC under the name hg19, was the first major human assembly released by the Genome Reference Consortium, which took over maintenance of the reference sequence produced by the Human Genome Project. It was released in 2009 and served as the standard for many years. GRCh38 (UCSC name hg38), released in 2013, represents a significant improvement in assembly quality. It corrects misassembled regions and erroneous bases and represents repetitive regions and structural variation more accurately, leading to a more comprehensive and reliable reference genome.
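Because the two assemblies use different coordinates, it helps to confirm which build a file was made against before comparing results. The following is a minimal sketch, assuming the contig lengths have already been read from a file header (for example the @SQ lines of a BAM or the ##contig lines of a VCF). The function names are hypothetical; the chromosome 1 lengths are the published values for each assembly.

```python
# Published chromosome 1 lengths (base pairs) for each assembly.
CHR1_LENGTHS = {
    249_250_621: "GRCh37",
    248_956_422: "GRCh38",
}


def normalize_name(name: str) -> str:
    """Strip a UCSC-style 'chr' prefix so that 'chr1' and '1' compare equal."""
    return name[3:] if name.startswith("chr") else name


def infer_build(contig_lengths: dict) -> str:
    """Guess the assembly from a mapping of contig name -> length.

    Only chromosome 1 is checked here; this is an illustrative shortcut,
    not a complete validation of the header.
    """
    for name, length in contig_lengths.items():
        if normalize_name(name) == "1":
            return CHR1_LENGTHS.get(length, "unknown")
    return "unknown"


print(infer_build({"chr1": 248_956_422, "chr2": 242_193_529}))
print(infer_build({"1": 249_250_621}))
```

Checking a single chromosome keeps the sketch short; a more careful check would compare every contig length in the header against the expected values for each build.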

Annotation Updates

Another key difference between GRCh37 and GRCh38 lies in annotation. Gene annotations are produced separately from the assembly itself, by projects such as GENCODE, Ensembl, and RefSeq, and current releases are built primarily on GRCh38. As a result, GRCh38 gives access to the latest annotations of alternative splicing, non-coding RNAs, and regulatory elements, while annotation on GRCh37 is largely frozen or mapped back from GRCh38. This gives researchers a more detailed view of the genome's functional elements and allows more precise analysis and interpretation of genomic data.

Contig Length

Another important aspect to consider when comparing GRCh37 and GRCh38 is contig length. Contigs are contiguous, gap-free stretches of sequence; they are joined into scaffolds and chromosomes, with unresolved gaps between them represented as runs of N. GRCh38 closes many of the gaps present in GRCh37, so its contigs are longer on average. Fewer gaps and ambiguities in the reference mean that more sequencing reads can be mapped and aligned to a well-defined position.
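The relationship between gaps and contig length can be illustrated on a toy sequence. This is a minimal sketch, assuming gaps are encoded as runs of N as in a reference FASTA; the two short sequences are invented for illustration and are not taken from either assembly. It reports the number of contigs, the number of gaps, and the contig N50 (the contig length at which contigs of that length or longer cover at least half of the contig bases).

```python
import re


def split_contigs(sequence: str) -> list:
    """Split a scaffold into contigs at runs of N (gap) characters."""
    return [c for c in re.split(r"[Nn]+", sequence) if c]


def count_gaps(sequence: str) -> int:
    """Count runs of N characters, each of which represents one gap."""
    return len(re.findall(r"[Nn]+", sequence))


def n50(lengths: list) -> int:
    """Return the contig N50 for a list of contig lengths."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


# Invented toy scaffolds: the second has one of the gaps filled in.
older = "ACGTACGT" + "N" * 5 + "ACG" + "N" * 3 + "ACGTAC"
newer = "ACGTACGT" + "ACGTT" + "ACG" + "N" * 3 + "ACGTAC"

for label, seq in (("older", older), ("newer", newer)):
    lengths = [len(c) for c in split_contigs(seq)]
    print(label, "contigs:", len(lengths), "gaps:", count_gaps(seq), "N50:", n50(lengths))
```

Filling a single gap merges two contigs into one, which lowers the gap count and raises the N50 at the same time.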

Chromosome Representation

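GRCh37 and GRCh38 also differ in how they represent chromosomes. In GRCh37, centromeres appear as multi-megabase gaps; GRCh38 replaces these with modeled centromere sequences. Telomeres, by contrast, remain largely unresolved in both assemblies. GRCh38 also greatly expands the set of alternate locus scaffolds, which provide additional sequence for regions such as the MHC where a single linear path cannot capture common haplotypes. This enhanced representation gives a more complete picture of the genome's structural organization and lets researchers study regions that were missing or incomplete in GRCh37.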
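One practical consequence of this richer representation is that a GRCh38 sequence file contains many sequences besides the primary chromosomes. The following is a rough sketch, assuming UCSC-style hg38 sequence names, where alternate loci end in "_alt", unlocalized scaffolds end in "_random", and unplaced scaffolds start with "chrUn_". The function name and category labels are hypothetical.

```python
def classify_sequence(name: str) -> str:
    """Classify a UCSC-style hg38 sequence name into a broad category."""
    if name.endswith("_alt"):
        return "alternate locus"
    if name.endswith("_random"):
        return "unlocalized scaffold"
    if name.startswith("chrUn_"):
        return "unplaced scaffold"
    if name == "chrM":
        return "mitochondrial"
    suffix = name[3:] if name.startswith("chr") else ""
    if suffix.isdigit() or suffix in {"X", "Y"}:
        return "primary chromosome"
    return "other"


names = [
    "chr1",
    "chrX",
    "chr6_GL000250v2_alt",
    "chr1_KI270706v1_random",
    "chrUn_KI270302v1",
    "chrM",
]

for name in names:
    print(name, "->", classify_sequence(name))
```

A filter like this is often the first step when deciding whether an analysis should be restricted to the primary chromosomes or should also consider alternate loci.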

Compatibility with New Technologies

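As technology advances, it is essential that the reference genome keep pace with how sequencing data is produced and analyzed. A reference assembly is not tied to any particular sequencing platform: reads from Illumina, PacBio, or other instruments can be aligned to either GRCh37 or GRCh38. The practical difference is that GRCh38 resolves regions that newer data can now measure, and that current annotation releases, variant databases, and analysis pipelines increasingly target it. This makes GRCh38 the preferred choice for many new genomic studies, while GRCh37 remains in use where legacy datasets and established pipelines depend on its coordinates.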

Population Diversity

Population diversity is another factor to consider when comparing GRCh37 and GRCh38. Both assemblies are built largely from the DNA of a small number of donors, so neither is a full survey of human variation. GRCh38 improves on GRCh37 mainly through its alternate locus scaffolds, which add sequence for haplotypes that differ substantially from the primary assembly. This matters for studying genetic variation and disease susceptibility across populations, because reads from such regions are less likely to be misaligned or discarded. By incorporating a broader range of sequence, GRCh38 provides a more comprehensive reference for population-based studies.

Conclusion

In conclusion, GRCh38 represents a significant improvement over GRCh37 in terms of assembly quality, annotation updates, contig length, chromosome representation, compatibility with new technologies, and population diversity. Researchers looking to conduct genomic studies should consider using GRCh38 for its enhanced accuracy, completeness, and compatibility with modern sequencing technologies. As the field of genomics continues to evolve, having a reliable and up-to-date reference genome like GRCh38 is essential for advancing our understanding of the human genome and its role in health and disease.
