2020
DOI: 10.1101/2020.03.05.974907
Preprint

Competitive mapping allows to identify and exclude human DNA contamination in ancient faunal genomic datasets

Abstract: After over a decade of developments in field collection, laboratory methods and advances in high-throughput sequencing, contamination remains a key issue in ancient DNA research. Currently, human and microbial contaminant DNA still impose challenges on cost-effective sequencing and accurate interpretation of ancient DNA data. Here we investigate whether human contaminating DNA can be found in ancient faunal sequencing datasets. We identify variable levels of human contamination, which persists…

Citations: cited by 4 publications (6 citation statements)
References: 54 publications
“…The classification showed the presence of four mammalian taxa with more than 2% of the Eukaryotic classified reads, which we investigated in further analyses: Ovis aries, Bos taurus, Homo sapiens and Canis lupus. To separate the sequencing reads of the four major mammalian taxa we built a multi-fasta reference file with the genomes of: H. sapiens (GRCh37 Assembly GCA_000001405.1), O. aries (Oar_v3.1, assembly, GCA_000298735.1), B. taurus (ARS-UCD1.2 assembly, GCA_002263795.2) and C. lupus (CanFam3.1 assembly, GCF_000002285.3) following a similar strategy described in Feuerborn et al (Feuerborn et al, 2020). The filtered reads were aligned with bwa aln (Li and Durbin, 2009) disabling seeding, and with a gap open penalty of two.…”
Section: Methods Details
mentioning
confidence: 99%
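The multi-fasta strategy quoted above can be illustrated with a minimal Python sketch. It is not the cited authors' pipeline. It assumes each genome's sequence names are prefixed with a species tag before concatenation, and that reads are later counted per species from SAM text lines. The function names, the `|` separator and the `min_mapq=30` cutoff are all hypothetical choices made for illustration. The idea behind the cutoff is that a read matching several genomes equally well tends to receive a low mapping quality when the genomes compete in one reference, so a threshold sets such reads aside.

```python
from collections import Counter


def tag_fasta_headers(fasta_lines, species):
    """Prefix every sequence name with a species tag (hypothetical scheme)."""
    for line in fasta_lines:
        if line.startswith(">"):
            yield ">" + species + "|" + line[1:]
        else:
            yield line


def build_competitive_reference(genomes):
    """Concatenate {species: fasta_lines} into one multi-FASTA line list."""
    combined = []
    for species, lines in genomes.items():
        combined.extend(tag_fasta_headers(lines, species))
    return combined


def count_reads_per_species(sam_lines, min_mapq=30):
    """Count confidently mapped reads per species tag from SAM text lines.

    min_mapq is an assumed cutoff, not a value taken from the source.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag, rname, mapq = int(fields[1]), fields[2], int(fields[4])
        # Skip unmapped reads (flag bit 0x4) and low-confidence placements.
        if flag & 4 or rname == "*" or mapq < min_mapq:
            continue
        counts[rname.split("|", 1)[0]] += 1
    return counts
```

In practice the combined reference would be written to disk and indexed by the aligner before mapping. The sketch covers only the bookkeeping that lets each alignment be traced back to a species.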
“…lupus (CanFam3.1 assembly, GCF_000002285.3) following a similar strategy described in Feuerborn et al (Feuerborn et al, 2020). The filtered reads were aligned with bwa aln The characteristics and quality of the mapped reads was assessed with qualimap 2.2.1 (Okonechnikov et al, 2016).…”
Section: Bioinformatic Processing Of Sample Sat29
mentioning
confidence: 99%
“…Small amounts of modern contamination in archaic sequencing experiments can “modernize” ancient individuals, leading to incorrect inferences of population history and archaic admixture [28–30] (Figure 2d). aDNA studies should always explicitly address the measures that were taken, both in handling and extracting the sample in the lab and in processing the sequence data, to measure and mitigate the effects of contamination [111]…”
Section: Alternative Explanations
mentioning
confidence: 99%
“…aDNA studies should always explicitly address the measures that were taken, both in handling and extracting the sample in the lab and in processing the sequence data, to measure and mitigate the effects of contamination. [111]…”
Section: Alternative Explanations
mentioning
confidence: 99%
“…BLAST is also used in this context, even though it was not originally designed for the purpose of binning (Altschul et al, 1990). However, since the premise of taxonomic binning is straightforward, many studies forgo specialized software and align their query sequences to a reference database themselves, for example using bwa or bowtie2 (Langmead & Salzberg, 2012), then assign query reads to species using least mismatch or exact match as described above or something similar (Anari, 2020; Feuerborn et al, 2020; de Filippo et al, 2018; Key et al, 2017; Pedersen et al, 2021; Prüfer et al, 2010; Warinner et al, 2017). Common choices of reference database are NCBI GenBank (Clark et al, 2015), Refseq (O'Leary et al, 2015), Ensembl (Howe et al, 2020) or a curated set of reference sequences built to suit the metagenomic data set in consideration.…”
mentioning
confidence: 99%
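The "least mismatch" assignment mentioned in the last statement can be sketched as a small Python function. This illustrates the general idea only and is not taken from any of the cited studies. The function name, the input shape (a mapping from species to mismatch count for one read) and the choice to leave tied reads unassigned are assumptions.

```python
def assign_by_least_mismatch(hits):
    """Return the species with the strictly lowest mismatch count, or None.

    hits maps species name -> number of mismatches for a single read.
    Ties are treated as ambiguous (an assumed policy) and yield None.
    """
    if not hits:
        return None
    best = min(hits.values())
    winners = [species for species, mm in hits.items() if mm == best]
    return winners[0] if len(winners) == 1 else None
```

An "exact match" policy would be the special case in which only a mismatch count of zero is accepted as a winner.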