2016
DOI: 10.1101/074385
Preprint

Who’s who? Detecting and resolving sample anomalies in human DNA sequencing studies with peddy

Abstract: The potential for genetic discovery in human DNA sequencing studies is greatly diminished if DNA samples from the cohort are mislabelled, swapped, contaminated, or include unintended individuals. Unfortunately, the potential for such errors is significant since DNA samples are often manipulated by several protocols, labs or scientists in the process of sequencing. We have developed peddy to identify and facilitate the remediation of such errors via interactive visualizations and reports comparing the stated se…

Cited by 12 publications (16 citation statements)
References 13 publications (12 reference statements)
“…Sample identity was verified by comparing sex and genotype between the WGS and RNA-seq data. In the WGS data, sex was determined from chromosome X heterozygosity using Peddy (v0.3.2; (Pedersen and Quinlan, 2017)), with the Peddy hg19.sites converted to GRCh38 using the UCSC Genome Browser LiftOver utility. High-quality variants with an allele frequency ≥1% were exported from the VCF using Hail for input into Peddy.…”
Section: Data Quality and Sample Identity Assessment
Citation type: mentioning
confidence: 99%
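The sex check described in the statement above can be illustrated with a minimal sketch. It assumes genotype calls on chromosome X (outside the pseudoautosomal regions) have already been counted per sample; the function names and the 0.1 cutoff are hypothetical and are not taken from Peddy or from the citing paper.

```python
def infer_sex_from_chrx(n_het, n_hom_alt, max_male_het_fraction=0.1):
    """Guess sex from the fraction of heterozygous calls on chromosome X.

    Males carry one X, so they should show almost no heterozygous calls
    there. The 0.1 cutoff is an illustrative assumption, not a Peddy default.
    """
    called = n_het + n_hom_alt
    if called == 0:
        return "unknown"  # no variant calls to base a decision on
    het_fraction = n_het / called
    return "male" if het_fraction < max_male_het_fraction else "female"


def sex_mismatches(reported_sex, chrx_counts):
    """Return sorted sample IDs whose inferred sex differs from the reported sex.

    reported_sex: dict of sample ID -> "male" or "female"
    chrx_counts:  dict of sample ID -> (n_het, n_hom_alt) on chromosome X
    """
    mismatched = []
    for sample_id, (n_het, n_hom_alt) in chrx_counts.items():
        inferred = infer_sex_from_chrx(n_het, n_hom_alt)
        if inferred != "unknown" and inferred != reported_sex.get(sample_id):
            mismatched.append(sample_id)
    return sorted(mismatched)
```

A sample reported as male but showing many heterozygous X calls would be returned by `sex_mismatches` as a candidate swap or mislabel.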
“…We sequenced the genomes of 603 individuals from 33, three-generation CEPH/Utah pedigrees to a genome-wide median depth of ~30X (Figure 1-figure supplement 1, Supplementary File 1), and removed 10 samples from further analysis following quality control using peddy (Pedersen and Quinlan 2017). After standard quality filtering, we identified a total of 4,671 germline de novo mutations in 70 second-generation individuals, each of which was transmitted to at least one offspring in the third generation (Figure 1a, Supplementary File 2).…”
Section: Identifying High-confidence DNMs Using Transmission To a Thi
Citation type: mentioning
confidence: 99%
“…We used peddy (Pedersen and Quinlan 2017) to perform relatedness and sample sequencing quality checks on all CEPH/Utah samples. We discovered a total of 10 samples with excess levels of heterozygosity (ratio of heterozygous to homozygous alternate calls > 0.2).…”
Section: Sample Quality Control and Filtering
Citation type: mentioning
confidence: 99%
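The heterozygosity filter quoted above can be sketched as follows. This applies the statement's wording literally (heterozygous calls divided by homozygous alternate calls, flagged above 0.2); peddy's own het-ratio calculation may be defined differently, and the function names and genotype labels are hypothetical.

```python
from collections import Counter


def het_to_hom_alt_ratio(genotypes):
    """Ratio of heterozygous to homozygous-alternate calls for one sample.

    genotypes: iterable of labels such as "hom_ref", "het", "hom_alt".
    Returns None when there are no homozygous-alternate calls.
    """
    counts = Counter(genotypes)
    if counts["hom_alt"] == 0:
        return None
    return counts["het"] / counts["hom_alt"]


def flag_excess_heterozygosity(samples, threshold=0.2):
    """Return sorted IDs of samples whose ratio exceeds the threshold.

    samples: dict of sample ID -> iterable of genotype labels.
    The 0.2 default is the value given in the quoted statement.
    """
    flagged = []
    for sample_id, genotypes in samples.items():
        ratio = het_to_hom_alt_ratio(genotypes)
        if ratio is not None and ratio > threshold:
            flagged.append(sample_id)
    return sorted(flagged)
```

Samples returned by `flag_excess_heterozygosity` would be the candidates for removal, since excess heterozygosity is a common sign of contamination.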