2009
DOI: 10.1186/1471-2105-10-213

A comparison of random forest and its Gini importance with standard chemometric methods for the feature selection and classification of spectral data

Abstract: Background: Regularized regression methods such as principal component or partial least squares regression perform well in learning tasks on high dimensional spectral data, but cannot explicitly eliminate irrelevant features. The random forest classifier with its associated Gini feature importance, on the other hand, allows for an explicit feature elimination, but may not be optimally adapted to spectral data due to the topology of its constituent classification trees which are based on orthogonal splits in fe…
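To make the abstract's terms concrete, here is a minimal sketch of the quantity behind the Gini importance: the decrease in Gini impurity produced by one orthogonal (axis-aligned) split on a single feature. It is an illustration under stated assumptions, not the paper's implementation. The function names and the toy data are hypothetical, and a real random forest would sum such decreases over every node that splits on a feature, across all trees.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum_k p_k^2 for a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def gini_decrease(samples, labels, feature, threshold):
    """Impurity decrease of the axis-aligned split x[feature] <= threshold.

    The decrease is the parent impurity minus the size-weighted
    impurity of the two child nodes.
    """
    left = [y for x, y in zip(samples, labels) if x[feature] <= threshold]
    right = [y for x, y in zip(samples, labels) if x[feature] > threshold]
    n = len(labels)
    children = (len(left) / n) * gini_impurity(left) \
        + (len(right) / n) * gini_impurity(right)
    return gini_impurity(labels) - children


# Hypothetical toy data: feature 0 separates the two classes,
# feature 1 carries no class information.
samples = [(0.1, 5.0), (0.2, 1.0), (0.8, 5.0), (0.9, 1.0)]
labels = [0, 0, 1, 1]

informative = gini_decrease(samples, labels, feature=0, threshold=0.5)
irrelevant = gini_decrease(samples, labels, feature=1, threshold=3.0)
print(informative, irrelevant)
```

On this toy data the informative feature yields a larger decrease than the irrelevant one, which is the kind of signal a ranking by Gini importance uses when eliminating features.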

Cited by 954 publications (584 citation statements)
References 40 publications
“…For any random forest classifier, the feature selection can be visualized. Using the "Gini importance" (24) to rank features over all ASTER images of this study (Fig. 2, Bottom Left), we find the information of ASTER channels 3, 5, and 6 to be the most important during training.…”
Section: Mapping Anthrosols At a Large Scale (mentioning)
confidence: 99%
“…The anthrosol probability maps of observations A-C are shown in the right column (black 100%, white 0%; 50% threshold outlined red). The bottom row shows the feature ranking for all ASTER images of the study using the Gini importance (22,24) as measure (boxes represent quartiles; black lines median; green crosses observations A-C). The final probability map, also shown in Fig.…”
Section: Mapping Anthrosols At a Large Scale (mentioning)
confidence: 99%
“…More applications of random forests can be found in other different fields like quantitative structure-activity relationship modeling [42], nuclear magnetic resonance spectroscopy [31], or clinical decision supports in medicine in general [11].…”
Section: Some Other Related Applications (mentioning)
confidence: 99%
“…Other applications include prediction of patient outcome from high-dimensional gene expression data [24,66,5] or proteomic mass spectra classification [30,52], where patients are instances and their outcome is the response to be predicted. Another class of applications deals with the prediction of molecule properties based on sequence information, e.g.…”
Section: Rf Applications In Bioinformatics: Some Examples (mentioning)
confidence: 99%