2021
DOI: 10.3389/fmicb.2021.644487

Beating Naive Bayes at Taxonomic Classification of 16S rRNA Gene Sequences

Abstract: Naive Bayes classifiers (NBC) have dominated the field of taxonomic classification of amplicon sequences for over a decade. Apart from having runtime requirements that allow them to be trained and used on modest laptops, they have persistently provided class-topping classification accuracy. In this work we compare NBC with random forest classifiers, neural network classifiers, and a perfect classifier that can only fail when different species have identical sequences, and find that in some practical scenarios …
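The "perfect classifier" mentioned in the abstract can be read as an upper bound on accuracy: whenever two species share an identical sequence, no classifier can label every copy of that sequence correctly. The sketch below is one possible way to compute such a bound, not the authors' method. The function name `perfect_classifier_accuracy`, the toy sequences, and the tie-breaking rule (predict the most common species for each distinct sequence) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def perfect_classifier_accuracy(records):
    """Best achievable accuracy when identical sequences must share one label.

    `records` is a list of (sequence, species) pairs. This helper is
    hypothetical and only illustrates the idea of an accuracy ceiling.
    """
    if not records:
        raise ValueError("records must not be empty")

    # Count how often each species appears for each distinct sequence.
    species_by_sequence = defaultdict(Counter)
    for sequence, species in records:
        species_by_sequence[sequence][species] += 1

    # For each distinct sequence, the best any classifier can do is to
    # predict its most frequent species; all other copies are errors.
    correct = sum(
        counts.most_common(1)[0][1] for counts in species_by_sequence.values()
    )
    return correct / len(records)


# Toy data (made up): species A and B share the sequence "ACGT".
toy_records = [
    ("ACGT", "species_A"),
    ("ACGT", "species_B"),
    ("TTGA", "species_C"),
    ("GGCC", "species_D"),
]
accuracy = perfect_classifier_accuracy(toy_records)
print(accuracy)
```

In this toy set, one of the two "ACGT" records must be misclassified, so the ceiling is three correct out of four.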

Cited by 19 publications
(19 citation statements)
References 31 publications
“…Therefore, the naive bayes requires more sample to train compared to the neural network in this work may be because of the data used. Similar results that Naive Bayes classifier outperformed the neural network classifier have been reported in different applications [48] (see [49] for a review). Additionally, sample size significantly influences the performance of classifiers, as seen in our results and in a review study that the large sample sizes depict relatively precise and similar accuracies among classifiers [47].…”
Section: Discussion
Citation type: supporting
confidence: 82%
“…The Quantitative Insights into Microbial Ecology 2 (QIIME 2 version 2021.8) [20] was used to quality filter, denoise and analyze the sequences. Demultiplexed paired-end sequence reads were denoised with DADA2 into amplicon sequence variants (ASV).…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…Feature table and feature data summaries were generated to determine the sequence distribution per sample as well as α-diversity (abundance of ASV within a sample) and β-diversity (microbial composition between samples) analysis of the samples. Taxonomic classifications were performed using Naïve Bayesian classifier [21] against the SILVA database release 138 [22] trimmed to the V3-V4 region of the 16S rRNA gene.…”
Section: Discussion
Citation type: mentioning
confidence: 99%