2016
DOI: 10.1007/s11222-016-9646-1

Correlation and variable importance in random forests

Abstract: This paper is about variable selection with the random forests algorithm in presence of correlated predictors. In high-dimensional regression or classification frameworks, variable selection is a difficult task, that becomes even more challenging in the presence of highly correlated predictors. Firstly we provide a theoretical study of the permutation importance measure for an additive regression model. This allows us to describe how the correlation between predictors impacts the permutation importance. Our re…

Cited by 656 publications (446 citation statements)
References 36 publications
“…Neither can we discuss the issue of variable importance stability in detail [36,47,48]. We rather present a study focusing on the application of the RF approach in the light of forest protection issues.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…This provides additional flexibility to the RF algorithm (Ma et al., 2006) and allows for favoring sensitivity or specificity to different classes. A variable can be considered a strong predictor when permuting it increases the prediction error (Gregorutti et al., 2017); therefore its importance I_V can be defined as:…”
Section: Evaluation Criteria
Citation type: mentioning
confidence: 99%
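
The idea quoted above, that a variable matters when permuting it increases the prediction error, can be illustrated with a minimal, model-agnostic sketch. This is not the formula elided in the quotation, nor the out-of-bag procedure that random forests use; it assumes a generic `predict` function, mean squared error as the error measure, and hypothetical toy data in which only the first predictor affects the response. The names `mse`, `permutation_importance`, `predict`, `X`, and `y` are all illustrative.

```python
import random


def mse(predict, X, y):
    """Mean squared error of predict over the rows of X with targets y."""
    return sum((predict(row) - t) ** 2 for row, t in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Average increase in MSE after shuffling each column, one at a time."""
    rng = random.Random(seed)
    baseline = mse(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        increase = 0.0
        for _ in range(n_repeats):
            # Shuffle column j only; this breaks its link with the response.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increase += mse(predict, X_perm, y) - baseline
        importances.append(increase / n_repeats)
    return importances


# Hypothetical toy data: y depends on x0 only; x1 is independent noise.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(200)]
y = [3.0 * row[0] for row in X]


def predict(row):
    # Stand-in for a fitted model (assumption): it uses x0 and ignores x1.
    return 3.0 * row[0]


importances = permutation_importance(predict, X, y)
```

In this toy setting the importance of `x0` is positive and the importance of `x1` is exactly zero, because the stand-in model never reads `x1`. With correlated predictors the picture is less clear-cut, which is the situation the paper addresses.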
“…Although the effect of the correlations on these measures has been studied recently (see Archer and Kimes, 2008; Strobl et al., 2009; Nicodemus and Malley, 2009; Nicodemus et al., 2010; Nicodemus, 2011; Auret and Aldrich, 2011; Tolosi and Lengauer, 2011; Grömping, 2009; Gregorutti et al., 2013), there is not yet a consensus on the interpretation of the importance measures when the predictors are correlated and on what the effect of this correlation is on the importance measure.…”
Section: Variable Importance
Citation type: mentioning
confidence: 99%