2023
DOI: 10.21203/rs.3.rs-2515453/v1
Preprint

Structural variation across 138,134 samples in the TOPMed consortium

Abstract: Ever larger Structural Variant (SV) catalogs highlighting the diversity within and between populations help researchers better understand the links between SVs and disease. The identification of SVs from DNA sequence data is non-trivial and requires a balance between comprehensiveness and precision. Here we present a catalog of 355,667 SVs (59.34% novel) across autosomes and the X chromosome (50bp+) from 138,134 individuals in the diverse TOPMed consortium. We describe our methodologies for SV inference result…

Cited by 4 publications (6 citation statements)
References 64 publications
“…We next investigated how many of these variants have been identified previously 19,56 . For this task we used ICA to annotate variant intersections to 1kGP, gnomAD and TOPMed.…”
Section: Results (mentioning)
confidence: 99%
“…For variant annotation, we used the merged variant file that is normalized and the multi-allelic sites split into different lines. We extracted the variants for 2,504 unrelated samples and then the annotation was done using Illumina Connected Annotations (ICA) from three different sources: gnomAD, 1kGP and TOPMed 19,84 . The novel variants are the ones with counts for all these four sources marked as zero in the annotated VCF file and the remaining i.e., with count > 0 for at least one source is considered to be known variant.…”
Section: Methods (mentioning)
confidence: 99%
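
The novel/known rule quoted in the Methods statement above can be sketched as a small function. This is only an illustration of the stated rule, not the citing authors' code: the function name `classify_variant` and the dictionary keys are hypothetical, and the sketch assumes the per-source counts have already been extracted from the annotated VCF. It accepts whichever sources are supplied rather than fixing their number.

```python
from typing import Mapping


def classify_variant(source_counts: Mapping[str, int]) -> str:
    """Label a variant 'novel' or 'known' from per-source annotation counts.

    Hypothetical helper: a variant is 'novel' when every annotation source
    reports a count of zero, and 'known' when at least one count is > 0.
    """
    if not source_counts:
        # Without any annotation source the rule cannot be applied.
        raise ValueError("at least one annotation source is required")
    if all(count == 0 for count in source_counts.values()):
        return "novel"
    return "known"


# Example with made-up counts; the source names are taken from the quoted text.
print(classify_variant({"gnomAD": 0, "1kGP": 0, "TOPMed": 0}))
print(classify_variant({"gnomAD": 0, "1kGP": 3, "TOPMed": 0}))
```

Passing the counts as a mapping keeps the rule independent of how many annotation sources are used.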
“…While most studies have focussed on single nucleotide polymorphisms (SNPs) and small insertions and deletions (Indels) in the past, there is an increasing interest also in structural variants (SVs), which are an important contributor to human phenotypes in general and to genetic diseases in particular [6][7][8][9][10][11][12]. They are known to affect more nucleotides and have higher impact on gene functions compared to SNPs or Indels [13].…”
Section: Introduction (mentioning)
confidence: 99%
“…As many rare and low-frequency variants associated with diseases tend to be population-specific (11), an increasing number of human genome projects are focusing on specific populations to provide population-specific reference panels. However, most of the whole-genome sequencing (WGS) efforts were carried out in European-descent individuals, such as the GoNL project (the Genome of the Netherlands project, n=769) (12), UK10K project (∼4,000 WGS and ∼6,000 whole exome sequencing samples in UK)(13, 14) and the TOPMed program (Trans-Omics for Precision Medicine, n=∼138,000) (15). These efforts have enabled the provision of a large-scale combined reference panel for the European population, such as the HRC panel (Haplotype Reference Consortium, n=32,470) (16).…”
Section: Introduction (mentioning)
confidence: 99%