2019
DOI: 10.1038/s41588-018-0316-4

Fast and accurate genomic analyses using genome graphs

Abstract: The human reference genome serves as the foundation for genomics by providing a scaffold for alignment of sequencing reads, but currently only reflects a single consensus haplotype, which impairs read alignment and downstream analysis accuracy. Reference genome structures incorporating known genetic variation have been shown to improve the accuracy of genomic analyses, but have so far remained computationally prohibitive for routine large-scale use. Here we present a graph genome implementation that enables re…

Cited by 183 publications (129 citation statements)
References 59 publications
“…We present the first diploid small variant benchmark that uses short-, linked-, and long-reads to confidently characterize a broad spectrum of genomic contexts, including non-repetitive regions as well as repetitive regions such as many segmental duplications, difficult to map regions, homopolymers, and tandem repeats. We demonstrated that the benchmark reliably identifies false positives and false negatives in more challenging regions across many short-, linked-, and long-read technologies and variant callers based on traditional methods, deep learning [8,9], graph-based references [10], and diploid assembly [12]. We designed this benchmark to cover as much of the human genome as possible with current technologies, as long as the benchmark genome sequence is structurally similar to the GRCh37 or GRCh38 reference.…”
Section: Discussion (mentioning)
confidence: 99%
“…[2][3][4] The Global Alliance for Genomics and Health (GA4GH) Benchmarking Team develop tools and best practices to use these benchmarks [5]. These benchmarks and benchmarking tools helped enable the development and optimization of new technologies and bioinformatics approaches, including linked reads [6], highly accurate long reads [7], deep learning-based variant callers [8,9], graph-based variant callers [10], and de novo assembly [11,12]. However, these benchmarks did not cover some challenging regions that these new methods could access, including many known medically relevant genes.…”
Section: Introduction (mentioning)
confidence: 99%
“…There is growing interest in using genetic variants to augment the reference genome into a graph genome [18][19][20]. To create a representative graph genome, the full spectrum of structural variations, including the alternate alleles, should be understood clearly.…”
Section: Discussion (mentioning)
confidence: 99%
“…SVs are abundant in plant genomes (Torkamaneh et al 2018) and play an important role in phenotypic variation among germplasm collections (Wang et al 2018). We think that this problem will be overcome thanks to long-read sequencing along with the development of graph-based reference genomes (Rakocevic et al 2019).…”
Section: Current Limitations (mentioning)
confidence: 99%