2020
DOI: 10.1101/2020.10.05.326504
Preprint

RESCRIPt: Reproducible sequence taxonomy reference database management for the masses

Abstract: Background: Nucleotide sequence and taxonomy reference databases are critical resources for widespread applications including marker-gene and metagenome sequencing for microbiome analysis, diet metabarcoding, and environmental DNA (eDNA) surveys. Reproducibly generating, managing, using, and evaluating nucleotide sequence and taxonomy reference databases creates a significant bottleneck for researchers aiming to generate custom sequence databases. Furthermore, database composition drastically influences results,…

Citations: cited by 102 publications (86 citation statements)
References: 119 publications (195 reference statements)
“…Hence, mature, existing methods for classification (NBC and some alignment-based classifiers) have already neared the upper limits of classification accuracy. The relationship between read length, primer selection, marker-gene target, sequence entropy, and taxonomic resolution has been well documented for 16S rRNA genes and other common targets, and even with long sequence reads (e.g., full-length 16S rRNA genes) species-level resolution can be challenging for many clades (Wang et al, 2007; Liu et al, 2008; Bokulich et al, 2018b; Johnson et al, 2019; Robeson et al, 2020). This is in part complicated by muddled microbial taxonomies (Oren and Garrity, 2014; Yarza et al, 2014) and misannotations and other issues with reference databases used for taxonomic classification (Kozlov et al, 2016).…”
Section: Discussion (citation type: mentioning)
confidence: 99%
“…3. Improvement of reference sequence and taxonomy databases (Parks et al, 2018; Robeson et al, 2020). 4.…”
Section: Discussion (citation type: mentioning)
confidence: 99%
“…Amplicon sequence variants (ASVs) were generated by DADA2 analysis, which were then classified to family and genus levels using the q2-feature-classifier [20], a Naïve Bayes machine learning classifier plugin in the QIIME2. Operational taxonomic units (OTUs) were generated by the RESCRIPt QIIME2 plugin running a feature classifier trained on the V3–V4 region of the 16S rRNA gene using a preformatted SILVA 138 reference database [21, 22]. An equal sampling depth of 10,000 was selected for every sample for assessing the diversities.…”
Section: Methods (citation type: mentioning)
confidence: 99%