2019
DOI: 10.1186/s12863-018-0710-z

funbarRF: DNA barcode-based fungal species prediction using multiclass Random Forest supervised learning model

Abstract: Background: Identification of unknown fungal species aids the conservation of fungal diversity. As many fungal species cannot be cultured, morphological identification of those species is almost impossible. However, DNA barcoding can be employed to identify such species. For fungal taxonomy prediction, the ITS (internal transcribed spacer) region of rDNA (ribosomal DNA) is used as the barcode. Though the computational prediction of fungal species has become feasible with the availability of huge v…

Cited by 21 publications (30 citation statements)
References 44 publications
“…However, automated verification can be achieved through phylogeny-based analysis of metabarcoding reads that compute statistical support values for alternative placements. This can be achieved either through local alignments of BLAST hits under a Bayesian framework (Munch et al 2008; Porter and Golding 2011), with a probabilistic approach such as PROTAX Fungi (Abarenkov et al 2018), through a "random forest" learning tool (Meher et al 2019), or through read placement into a separately established reference tree (Berger et al 2011; Matsen et al 2012; Barbera et al 2019).…”
Section: Blast Mapping
Citation type: mentioning
confidence: 99%
“…While the Bayesian framework in pplacer offers direct assessment of statistical confidence, the EPA allows the computing of bootstrap support values for potential alternative read placements. These options provide an automated, quantitative verification step not available through OTU clustering or BLAST mapping, except with approaches such as PROTAX Fungi and "random forest" learning (Abarenkov et al 2018; Meher et al 2019). Optionally, prior to invoking the EPA, the phylogenetic pattern of the metabarcoding marker over the fixed reference alignment can be analyzed using a maximum parsimony or maximum likelihood approach in order to compute a weight vector.…”
Section: Multiple Alignment-based Read Placement
Citation type: mentioning
confidence: 99%
“…Simple-logistic, IBK, PART, Attribute-selected Classifier, Bagging approaches were also implemented in another study [49] in this regard. SMO, BP-NN [50], RF [51], k-mer based approaches [52] [53] can also be perceived in recent studies. Naïve Bayes is also applied on COI [54] barcode database demonstrating misclassification rates and also on ribosomal databases [55] effectively.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Several rRNA genes have been successfully employed for fungal species identification, including the small ribosomal subunit, the large ribosomal subunit, the RNA polymerase II binding protein, and the internal transcribed spacer (ITS). Among these, the ITS (including ITS1 and ITS2 separated by the 5.8S genic region) has been widely adopted as a marker for fungal identification and diversity exploration [15][16][17][18][19] because this region is ubiquitous and shows great variation in sequence and length [9].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…The Warcup training set was developed from the UNITE database and includes only sequences with authoritative taxonomic or lineage information [9]. In addition, ITS barcodes in the BOLD database (http://www.boldsystems.org/) [22] and the ITS1 database comprising sequences of NCBI GenBank (http://www.ncbi.nlm.nih.gov/) have been used for fungal species identification [10,15]. DNA barcode-based taxonomic assignment can be achieved by using similarity-based or prediction-based (alignment-free) methods.…”
Section: Introduction
Citation type: mentioning
confidence: 99%