2020
DOI: 10.1038/s41586-020-2287-8

A structural variation reference for medical and population genetics

Abstract: Structural variants (SVs) rearrange large segments of DNA [1] and can have profound consequences in evolution and human disease [2,3]. As national biobanks, disease-association studies, and clinical genetic testing have grown increasingly reliant on genome sequencing, population references such as the Genome Aggregation Database (gnomAD) [4] have become integral in the interpretation of single-nucleotide variants (SNVs) [5]. However, there are no reference maps of SVs from high-coverage genome sequencing comparable to tho…

Cited by 658 publications (706 citation statements)
References 52 publications
“…The initial discovery effort from the 1000 Genomes Project [12,13] revealed that a diverse landscape of SVs could be captured from srWGS with just 4-7X coverage (3,422 SVs per genome), and more recent population genetic and human disease studies using deeper (30X or higher) srWGS and diverse methods have varied in estimates of SVs that can be captured using srWGS from 401-10,884 per genome, with the highest end of this range generated from the Human Genome Structural Variation Consortium (HGSVC; Figure 1A). [1,13-18] Emerging long-read WGS (lrWGS) technologies, which involve sequencing thousands to millions of contiguous nucleotides from a single strand of DNA, are better suited for SV discovery than srWGS. The most widely tested lrWGS technologies include single-molecule real-time (SMRT) sequencing from Pacific Biosciences (PacBio) and sequencing by ionic current through a nanopore channel (Oxford Nanopore Technologies [ONT]).…”
Section: Main Text (mentioning)
confidence: 99%
“…The number and composition of SVs should be compared with previous studies of similar cohorts to identify problematic samples and SVs. For germline SVs, a recent large population-based study reported an average of 4400 germline SVs per individual (Abel et al., 2020), and the Genome Aggregation Database reported 7400 germline SVs per individual on average (Collins et al., 2020). Both studies were based on Illumina short reads.…”
Section: Quality Control (mentioning)
confidence: 99%
“…[2,3] Structural variants are genomic rearrangements involving more than 50 nucleotides that contribute to genomic diversity and function, evolution, and can cause somatic and germline diseases. [4-6] Despite improvements in genomic technologies, characterization of SVs remains challenging and the full spectrum of SVs is not achieved by routine methods such as microarrays or other targeted sequencing approaches. In ATD, the detection and characterization of SVs remain particularly challenging due to the high number of repetitive elements in and around SERPINC1 (35% of SERPINC1 sequence are interspersed repeats).…”
Section: Main Text (mentioning)
confidence: 99%