2020
DOI: 10.1101/2020.06.17.157347
Preprint

Validated removal of nuclear pseudogenes and sequencing artefacts from mitochondrial metabarcode data

Abstract: 1. Metabarcoding of Metazoa using mitochondrial genes is confounded by the co-amplification of mitochondrial pseudogenes (NUMTs). Current denoising protocols have been designed to remove PCR and sequencing artefacts, but pseudogenes are not usually recognised by these procedures. Authentic mitochondrial amplicon sequence variants (ASVs), which represent the majority of reads, can be distinguished from PCR-derived errors, sequencing errors and NUMTs (non-authentic ASVs) due to their lower abundances. However, th…
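
The abundance-based idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' protocol or the NUMTdumper implementation: the function name `filter_low_abundance_asvs`, the `min_rel_abundance` parameter, the rule "retain an ASV if it reaches the threshold in at least one sample", and the example counts are all assumptions made for illustration.

```python
from typing import Dict, Set

# Hypothetical helper: not part of NUMTdumper or any published pipeline.
def filter_low_abundance_asvs(
    counts: Dict[str, Dict[str, int]],
    min_rel_abundance: float = 0.01,
) -> Dict[str, Dict[str, int]]:
    """Keep ASVs that reach min_rel_abundance of the reads in at least one sample.

    counts maps sample name -> {ASV id -> read count}.
    The threshold is an assumed, user-chosen value, not one from the paper.
    """
    retained: Set[str] = set()
    for asv_counts in counts.values():
        total = sum(asv_counts.values())
        if total == 0:
            continue  # an empty sample cannot support any ASV
        for asv, n in asv_counts.items():
            if n / total >= min_rel_abundance:
                retained.add(asv)
    # Drop every ASV that never reached the threshold in any sample.
    return {
        sample: {asv: n for asv, n in asv_counts.items() if asv in retained}
        for sample, asv_counts in counts.items()
    }

# Made-up read counts for two samples.
example = {
    "site_A": {"asv1": 950, "asv2": 45, "asv3": 5},
    "site_B": {"asv1": 10, "asv3": 990},
}
filtered = filter_low_abundance_asvs(example, min_rel_abundance=0.05)
```

In this toy input, `asv2` never reaches 5% of the reads in any sample and is dropped, while `asv3` is kept because it dominates `site_B` despite being rare in `site_A`. A stricter threshold removes more putative non-authentic ASVs but risks discarding genuine low-abundance haplotypes, which is the trade-off the citing literature attributes to this kind of filtering.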

Cited by 26 publications (53 citation statements)
References 51 publications (65 reference statements)

“…Haplotypes from each region were further filtered to remove likely nuclear mitochondrial (numts) pseudogenes, following a protocol based on the relative abundance of codistributed reads (Andújar et al., 2020). The set of putative haplotypes for Acari, Collembola and Coleoptera was used to generate a community table with read counts (haplotype abundance) by sample against the complete collection of reads (i.e.…”
Section: Methods (mentioning)
confidence: 99%
“…The methodology involves the bulk sequencing of mixed communities and subsequent clustering of DNA reads into operational taxonomic units (OTUs) that broadly represent the species category. While an efficient method to approximate community profiles at the species level, precise removal of primary DNA reads affected by sequencing errors (Andújar, Arribas, Yu, Vogler, & Emerson, 2018; Elbrecht, Vamos, Steinke, & Leese, 2018; Turon, Antich, Palacín, Præbel, & Wangensteen, 2019) and co‐amplified nuclear mitochondrial copies (numts; Andújar et al., 2020) would avert the need for clustering. Read‐based data raise the prospect of reliable haplotype information from mitochondrial COI cMBC, which represents a step change for the study of diversity patterns through whole‐community genetic analyses at haplotype‐level resolution.…”
Section: Introduction (mentioning)
confidence: 99%
“…Another python package called ‘Alfie’ calculates k-mer frequencies and classifies COI metabarcode sequences to the kingdom rank using a machine learning method [56]. A new program, called NUMTdumper, has been developed as a stand-alone program meant to be incorporated into bioinformatic pipelines [57]. NUMTdumper provides a method to screen for NuMTs based on read counts while acknowledging the trade-offs between removing all possible NuMTs while erroneously removing genuine reads.…”
Section: Discussion (mentioning)
confidence: 99%
“…Baselga et al., 2013, 2015; Craft et al., 2010; Gómez‐Rodríguez et al., 2019; Múrria et al., 2015; Papadopoulou et al., 2011; Salces‐Castellano et al., 2020; Scalercio et al., 2020). Together with advances in both (a) the generation of community‐level metabarcode data and (b) the recovery of reliable intraspecific sequence variation (Andújar et al, 2020; Elbrecht, Vamos, Steinke, & Leese, 2018; Turon, Antich, Palacín, Præbel, & Wangensteen, 2019), the logistical constraints for site‐based community barcoding are greatly reduced, even for species‐rich and hyperdiverse assemblages (Arribas, Andújar, Salces‐Castellano, Emerson, & Vogler, 2020). Site‐based community metabarcoding thus presents itself as an exciting opportunity, particularly given the potential for revealing otherwise hidden patterns, as exemplified by Scalercio et al.…”
Section: Figure (mentioning)
confidence: 99%