2023
DOI: 10.1099/mgen.0.000949

From defaults to databases: parameter and database choice dramatically impact the performance of metagenomic taxonomic classification tools

Abstract: In metagenomic analyses of microbiomes, one of the first steps is usually the taxonomic classification of reads by comparison to a database of previously taxonomically classified genomes. While different studies comparing metagenomic taxonomic classification methods have determined that different tools are ‘best’, there are two tools that have been used the most to date: Kraken (k-mer-based classification against a user-constructed database) and MetaPhlAn (classification by alignment to clade-specific marker g…

Citations: cited by 27 publications (36 citation statements)
References: 71 publications (180 reference statements)
“…Given that many studies and microbiome-based diagnostics prioritize composite metrics (e.g., SD) (27–29), both 16S-based approaches tested in our workflow offer a suitable and cost-effective alternative for characterizing alpha diversity metrics. In terms of inter-sample diversity, and in agreement with previously published studies (30), we observed disparities in taxonomic profiling among the various sequencing technologies (31, 32) and bioinformatics pipelines (33, 34), but the magnitude of these differences remained low (15.4% and 13.9% variability for primers A and B, respectively). While the reasons for differences observed between 16S-based and shotgun-based profiling have been described elsewhere (30), a significant proportion of this variability can be explained with the inability to perfectly match the taxonomies derived from the different analytical pipelines used in this study.…”
Section: Discussion (supporting)
confidence: 91%
“…For taxonomic analysis, to capture the full diversity of WHS mice gut microbiome, we used the full NCBI/RefSeq prokaryote genome sequence database (NCBI RefSeq Complete V205) (24).…”
Section: Selective Breeding For Active Tameness Did Not Affect Taxono... (mentioning)
confidence: 99%
“…0. (1189 GB)) (24) to ensure comprehensive species identification within the gut samples. Default parameters were used for Kraken 2, with a confidence score threshold set at 0.15.…”
Section: Quality Control Of Metagenomic Sequences (mentioning)
confidence: 99%
“…Consequently, the size of reference databases of taxonomic classifiers is also growing, often outpacing the computational capacity available to researchers. In fact, while this was one of the main motivations behind classifiers such as Kraken2 (Wood, Lu, and Langmead 2019), these algorithmic techniques are already becoming insufficient (Wright, Comeau, and Langille 2023).…”
Section: Introduction (mentioning)
confidence: 99%