2012
DOI: 10.1016/j.ajhg.2012.09.004

Detecting and Estimating Contamination of Human DNA Samples in Sequencing and Array-Based Genotype Data

Abstract: DNA sample contamination is a serious problem in DNA sequencing studies and may result in systematic genotype misclassification and false positive associations. Although methods exist to detect and filter out cross-species contamination, few methods to detect within-species sample contamination are available. In this paper, we describe methods to identify within-species DNA sample contamination based on (1) a combination of sequencing reads and array-based genotype data, (2) sequence reads alone, and (3) array…

Citations: cited by 446 publications (393 citation statements)
References: 11 publications (6 reference statements)
“…In addition, adapter sequences can be automatically detected and removed from genomic data sets (often simultaneously with a quality trimming of the data) using one of the software packages described in Table 2. Different mitigation techniques also exist for the removal of genomic contaminants, and some of the most popular ones include ContEst (Cibulskis et al, 2011), DeconSeq (Schmieder and Edwards, 2011a), QC-Chain (Zhou et al, 2013) as well as the set of methods developed by Jun et al, 2012 used in the 1000 Genomes Project.…”
Section: Quality Assessment (mentioning)
confidence: 99%
“…There are several methods to detect cross-sample contamination. However, all of them are usually supported by the additional information about mutations in other samples in a batch [226] or known genotypes [222].…”
Section: What Is Done In the Area To Solve Cross-contamination (mentioning)
confidence: 99%
“…Contaminated samples often have unusually high levels of heterozygosity [222,223]. It is advised either to exclude contaminated samples from analysis, or model sample contamination during analysis to obtain more accurate SNP and genotype calls.…”
Section: Cross-contamination Of Samples (mentioning)
confidence: 99%
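
The heterozygosity signal mentioned in the statement above can be illustrated with a minimal sketch. This is not the likelihood-based method of the cited paper; it is only a simple per-sample screen, and the function names, the 0/1/2 genotype coding, and the z-score cutoff of 3 are all assumptions made for illustration.

```python
from statistics import mean, stdev


def heterozygosity_rate(genotypes):
    """Fraction of called genotypes that are heterozygous.

    Assumed coding: alternate-allele count 0, 1 (heterozygous) or 2;
    None marks a missing call and is ignored.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    return sum(1 for g in called if g == 1) / len(called)


def flag_high_heterozygosity(samples, z_cutoff=3.0):
    """Return names of samples whose heterozygosity rate lies more than
    z_cutoff sample standard deviations above the batch mean.

    `samples` maps a sample name to its list of genotype codes.
    """
    rates = {name: heterozygosity_rate(g) for name, g in samples.items()}
    mu = mean(rates.values())
    sd = stdev(rates.values())  # needs at least two samples
    if sd == 0:
        return []
    return sorted(name for name, r in rates.items() if (r - mu) / sd > z_cutoff)
```

A screen like this needs a reasonably large batch: with n samples, a single outlier cannot reach a z-score above (n - 1) / sqrt(n), so a cutoff of 3 can never fire with ten or fewer samples. Flagged samples would still need confirmation with a dedicated contamination estimator.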
“…And Jun et al [5] demonstrates a likelihood-based method that could detect DNA sample contamination either using sequence data alone or with array-based genotypes. Although both methods are sensitive for estimating levels of contamination as low as 1%~1.5%, neither of them considers the genomic feature of large proportion of repeated sequences and pseudoautosomal (PAR) gene region on sex chromosomes.…”
Section: Introduction (mentioning)
confidence: 99%