2017
DOI: 10.1186/s12915-017-0366-6

Patterns of cross-contamination in a multispecies population genomic project: detection, quantification, impact, and solutions

Abstract: Background: Contamination is a well-known but often neglected problem in molecular biology. Here, we investigated the prevalence of cross-contamination among 446 samples from 116 distinct species of animals, which were processed in the same laboratory and subjected to subcontracted transcriptome sequencing. Results: Using cytochrome oxidase 1 as a barcode, we identified a minimum of 782 events of between-species contamination, with approximately 80% of our samples being affected. An analysis of laboratory metadata …

Cited by 106 publications (128 citation statements)
References 48 publications
“…Currently, the rapid advances of the next‐generation sequencing (NGS) technologies have increased the sequencing throughput and lowered sequencing costs, allowing a wide range of large‐scale genomic studies. However, although the quality of sequence data has been improved, the methods and protocols are still not perfect and errors inevitably occur. Indeed, it is known that data from massive sequencing projects based on NGS technologies (e.g., Illumina, 454 and Ion Torrent) contain contaminant sequence reads, difficult to detect and remove, that are lost in the myriad of reads from the target sample.…”
Section: Introduction
confidence: 99%
“…NGS library construction protocols involve one or multiple polymerase chain reaction (PCR) steps that are particularly sensitive to contamination. In fact, initially small amounts of foreign DNA can accidentally be amplified by PCR contaminating the downstream data sets. This issue becomes particularly serious when the target sample is derived from a non‐model species lacking a reference genome, so that genuine sequence reads cannot be easily identified by similarity.…”
Section: Introduction
confidence: 99%