2018
DOI: 10.1111/eva.12604

Optimization and performance testing of a sequence processing pipeline applied to detection of nonindigenous species

Abstract: Genetic taxonomic assignment can be more sensitive than morphological taxonomic assignment, particularly for small, cryptic or rare species. Sequence processing is essential to taxonomic assignment, but can also produce errors because optimal parameters are not known a priori. Here, we explored how sequence processing parameters influence taxonomic assignment of 18S sequences from bulk zooplankton samples produced by 454 pyrosequencing. We optimized a sequence processing pipeline for two common research goals,…

Cited by 25 publications (31 citation statements)
References 46 publications
“…A suite of programs facilitates the identification and removal of both PCR and sequencing errors (reviewed in Coissac et al 2012). Thus, sequence processing is often performed to reduce technical errors, but the bioinformatics steps can be an important source of error because appropriate parameters are often not known (Clare et al 2016, Scott et al 2018). Flynn et al (2015) evaluated the performance of commonly used clustering methods on a series of mock communities and on more complex natural communities of zooplankton.…”
Section: Major Bioinformatics Challenges
Citation type: mentioning
confidence: 99%
“…The performance of bioinformatics pipelines can be tested through simulation. For example, Scott et al (2018) tested 1,050 combinations of parameters to determine the optimal parameter sets for particular research goals. They then tested the pipeline performance (detectability and sensitivity) by computationally inoculating sequences from 20 aquatic invasive species into 10 zooplankton community samples, revealing that optimal parameter selection often depends on the research goal.…”
Section: Major Bioinformatics Challenges
Citation type: mentioning
confidence: 99%
“…OTU tables, a list of OTUs obtained for each sample and the number of sequences assigned to them, were constructed, clustering reads with a 100% identity between them and maintaining all assigned sequences including singletons to retain maximum sensitivity for species detection. The removal of singletons is usually employed to eliminate false positives as proposed by Scott et al (2018), in the context of species survival, NIS early detection, or marine biosecurity surveillance; a false negative is more costly than a false positive (von Ammon et al, 2018). Sequences of organisms without relevance for the study (e.g., human, insects, terrestrial plants, etc.)…”
Section: DNA and Bioinformatics Analysis
Citation type: mentioning
confidence: 99%
“…reference databases selection for sequences comparison) (Zhan et al 2014, Flynn et al 2015). In our review, we found that different pipelines have been employed in the bioinformatics workflows (Hatzenbuhler et al 2017, Scott et al 2018, von Ammon et al 2018b) (Table S5), which further complicates comparison among studies.…”
Section: Bioinformatics Pipelines
Citation type: mentioning
confidence: 99%
“…However, similarity thresholds can be strongly dependent on the marker and the length of the fragments targeted. While for the 18S rRNA gene (more conserved) an increase from 97% to 99% OTUs clustering may allow species to be split, for typical COI fragments, which display high sequence variability, this alteration may not produce any meaningful increase in the number of species detected (Scott et al 2018). The same applies with the length of the fragments under analysis, with shorter fragments being probably more sensitive to similarity thresholds than longer fragments.…”
Section: Bioinformatics Pipelines
Citation type: mentioning
confidence: 99%