2006
DOI: 10.1177/0049124105283119

An Introduction to Ensemble Methods for Data Analysis

Abstract: This paper provides an introduction to ensemble statistical procedures as a special case of algorithmic methods. The discussion begins with classification and regression trees (CART) as a didactic device to introduce many of the key issues. Following the material on CART is a consideration of cross-validation, bagging, random forests and boosting. Major points are illustrated with analyses of real data.

Cited by 125 publications (119 citation statements)
References: 25 publications
“…Our decision here to utilise random forest technique is informed by evidence in the social science literature that ensemble methods carry the potential for additional roles on top of the more common classification and forecasting roles, namely, serving as diagnostics tools to conventional methods, and providing further insight into the relation between the explanatory and response variables [39]. Moreover, their conceptual resemblance to wider adopted methods such as regression trees [27], make them a suitable candidate to introduce as a data-driven methodology into the process.…”
Section: Enhancing the Workflow (mentioning)
confidence: 99%
“…This tree can be used for prediction, by 'dropping' new observations down the tree (Breiman et al, 1984). For a more extensive description of the CART algorithm, see Berk (2006) or Strobl, Malley, and Tutz (2009). Gigerenzer and Goldstein (1996) and Gigerenzer et al (1999) have suggested CART as a powerful algorithm for the creation of simple decision making tools, because CART trees, like FFTs evaluate one cue at a time in order to arrive at a final decision.…”
Section: Classification and Regression Trees (mentioning)
confidence: 99%
“…RuleFit is a so-called ensemble method (e.g., Berk, 2006): it combines the predictions of multiple simple prediction functions to make a final prediction. The RuleFit model, as most learning ensembles, takes the form…”
Section: RuleFit Algorithm (mentioning)
confidence: 99%
“…As noted there, ensemble learning focuses on the handling of the output of two or more intelligent systems in parallel. The reviews provided by [2,3] are recommended for their particular value in understanding ensemble methods.…”
Section: I. Ensemble Architectures (mentioning)
confidence: 99%