2022
DOI: 10.1126/science.abl3533

A complete reference genome improves analysis of human genetic variation

Abstract: Compared to its predecessors, the Telomere-to-Telomere CHM13 genome adds nearly 200 million base pairs of sequence, corrects thousands of structural errors, and unlocks the most complex regions of the human genome for clinical and functional study. We show how this reference universally improves read mapping and variant calling for 3202 and 17 globally diverse samples sequenced with short and long reads, respectively. We identify hundreds of thousands of variants per sample in previously unresolved regions, sh…

Citations: cited by 175 publications (121 citation statements)
References: 107 publications
“…However, the role of satellite repeats and their associated constitutive heterochromatin remains unclear, as their high repetitiveness makes them almost entirely missing from the genome assembly in the past two decades, and a lack of in vivo models to determine their functional impacts also hinders their study. Recent advances in fully sequencing the T2T-CHM13 genome now demonstrate the rich genetic and epigenetic variations hidden in the repeat elements of the human genome (Nurk et Gershman et al, 2022; Hoyt et al, 2022; Aganezov et al, 2022). Our finding of hyper-variability of heterochromatin landscape in human neocortical neurons as well as their physical and functional regulation by NDE1/Nde1 aligns fully with the new human genome information.…”
Section: Discussion (supporting)
confidence: 69%
“…4a). Additionally, we found that CHM13v1.0 had 33× fewer false or rare collapses than GRCh38 (~185 loci covering 6.84 Mb) 6 . We identified five regions (160 kb) with rare duplications in CHM13v1.0.…”
Section: Identification and Correction Of Assembly Errors (mentioning)
confidence: 67%
“…In summary, we found 7.5× fewer rare or falsely duplicated bases in CHM13v1.0 relative to the 12 likely falsely duplicated regions affecting 1.2 Mb and 74 genes in GRCh38 (ref. 6), including the medically relevant CBS, CRYAA and KCNE1 genes 46.…”
Section: Identification and Correction Of Assembly Errors (mentioning)
confidence: 99%
“…Notably, most recent population-scale studies of SVs have used pan-genomic models to eliminate reference bias and successfully scaled up the population size for genotyping from a few hundred to thousands of samples [4], [32], [63], [64], [65], facilitating other genotype-based downstream analyses. Recently, the Telomere-to-Telomere (T2T) Consortium and the Human Pangenome Reference Consortium have successively announced their exciting progress in constructing complete and error-free T2T assemblies of all chromosomes as well as full-spectrum genomic variant collections [116], [117], [118], which will further promote the application of pan-genomic approaches in population genetic studies.…”
Section: Discussion (mentioning)
confidence: 99%