2021
DOI: 10.1038/s41467-020-20850-5

PopDel identifies medium-size deletions simultaneously in tens of thousands of genomes

Abstract: Thousands of genomic structural variants (SVs) segregate in the human population and can impact phenotypic traits and diseases. Their identification in whole-genome sequence data of large cohorts is a major computational challenge. Most current approaches identify SVs in single genomes and afterwards merge the identified variants into a joint call set across many genomes. We describe the approach PopDel, which directly identifies deletions of about 500 to at least 10,000 bp in length in data of many genomes jo…

Cited by 10 publications
(12 citation statements)
References 54 publications
“…S12) as a sanity check that our variant calls can be used to clearly distinguish samples from different continental groups. Analogously to previous call set evaluations (Niehus et al., 2021), we converted the genotypes into a variant-sample matrix containing NRS variant allele counts and filtered uninformative NRS variants and those in linkage disequilibrium. As a result, the principal component analysis was calculated on 1787 variants.…”
Section: Results (mentioning)
confidence: 99%
“…For both PopIns and PopIns2, we calculated the Mendelian inheritance error rate and transmission rate as in Niehus et al. (2021). The Mendelian inheritance error rate is a measure to assess the plausibility of variant genotypes in related individuals.…”
Section: Results (mentioning)
confidence: 99%
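As a rough illustration of the measure named in the statement above, the following Python sketch counts parent-child trios whose genotypes cannot arise by Mendelian transmission. It is a sketch under assumptions, not the procedure of Niehus et al. (2021): genotypes are assumed to be biallelic alternate-allele counts (0, 1 or 2) on an autosome, trios with a missing genotype are skipped, and the names `TRANSMITTABLE`, `is_mendelian_consistent` and `mendelian_error_rate` are hypothetical.

```python
from typing import Iterable, Optional, Tuple

# Alternate-allele counts a parent can pass on, keyed by the parent's own
# alternate-allele count (assumption: biallelic, diploid, autosomal site).
TRANSMITTABLE = {0: {0}, 1: {0, 1}, 2: {1}}

# (father, mother, child); None marks a missing genotype.
Trio = Tuple[Optional[int], Optional[int], Optional[int]]


def is_mendelian_consistent(father: int, mother: int, child: int) -> bool:
    """True if the child genotype equals one allele from each parent."""
    return any(
        f + m == child
        for f in TRANSMITTABLE[father]
        for m in TRANSMITTABLE[mother]
    )


def mendelian_error_rate(trios: Iterable[Trio]) -> Optional[float]:
    """Fraction of fully genotyped trios that violate Mendelian inheritance."""
    checked = 0
    errors = 0
    for father, mother, child in trios:
        if father is None or mother is None or child is None:
            continue  # assumption: incomplete trios are excluded from the rate
        checked += 1
        if not is_mendelian_consistent(father, mother, child):
            errors += 1
    return errors / checked if checked else None
```

Which trios enter the denominator (for example, whether trios where all three genotypes are homozygous reference are counted) changes the rate, so published values depend on that choice.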
“…In addition, we performed a principal component analysis (Supplement Figure 12) as a sanity check that our variant calls can be used to clearly distinguish samples from different continental groups. Analogously to previous call set evaluations [Niehus et al., 2021], we converted the genotypes into a variant-sample matrix containing NRS variant allele counts and filtered uninformative NRS variants and those in linkage disequilibrium. As a result, the principal component analysis was calculated on 1787 variants.…”
Section: Results (mentioning)
confidence: 99%
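The first two steps described in the statement above (genotypes to allele counts, then dropping uninformative variants) might look roughly like the Python sketch below. It assumes VCF-style diploid genotype strings such as `0/1` or `1|1`, treats a missing allele (`.`) as non-variant, and takes "uninformative" to mean a variant whose allele count is identical in every sample. The citing paper's actual criteria are not given here, and linkage-disequilibrium pruning and the PCA itself are left out. The function names are hypothetical.

```python
from typing import Dict, List


def allele_count(genotype: str) -> int:
    """Count non-reference alleles in a genotype string such as '0/1'."""
    alleles = genotype.replace("|", "/").split("/")
    # Assumption: a missing allele ('.') is counted as non-variant.
    return sum(1 for allele in alleles if allele not in ("0", "."))


def variant_sample_matrix(calls: Dict[str, List[str]]) -> Dict[str, List[int]]:
    """Map each variant ID to per-sample allele counts, dropping variants
    whose count is the same in every sample (no signal for a PCA)."""
    matrix = {
        variant: [allele_count(gt) for gt in genotypes]
        for variant, genotypes in calls.items()
    }
    return {v: row for v, row in matrix.items() if len(set(row)) > 1}


# Made-up example: three samples, three variants.
calls = {
    "v1": ["0/0", "0/1", "1/1"],
    "v2": ["0/0", "0/0", "0/0"],  # absent everywhere, filtered out
    "v3": ["1|1", "1/1", "1/1"],  # fixed everywhere, filtered out
}
informative = variant_sample_matrix(calls)
```

Only `v1` survives the filter in this toy input, with the row `[0, 1, 2]`.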
“…We previously noted its similarity to a classical genome assembly problem [Kehr et al., 2016]. Classical genome assembly for short read data is commonly based on de Bruijn graphs (DBG) [Pevzner et al., 2001; Compeau et al., 2011]. The tool Cortex [Iqbal et al., 2012] augmented DBGs with colors in order to simultaneously process several genomes.…”
Section: Introduction (mentioning)
confidence: 99%
“…We attempted replication for each of the 33 trait-associated SVs using a combination of short-read and long-read WGS data and genotype imputation. We utilized independent datasets composed of Icelandic (deCODE genetics) [17–19] and multi-ancestry (UK Biobank, UKBB) [20] participants. Note that the SV calling and genotyping algorithms used in replication datasets (described under Methods) are different from the Parliament2 pipeline used for SV discovery in TOPMed.…”
Section: Replication of Significant SV-Blood Cell Trait Associations (mentioning)
confidence: 99%