2020
DOI: 10.1016/j.jspi.2019.04.004

Estimating a sharp convergence bound for randomized ensembles

Abstract: When randomized ensembles such as bagging or random forests are used for binary classification, the prediction error of the ensemble tends to decrease and stabilize as the number of classifiers increases. However, the precise relationship between prediction error and ensemble size is unknown in practice. In the standard case when classifiers are aggregated by majority vote, the present work offers a way to quantify this convergence in terms of "algorithmic variance," i.e. the variance of prediction error due to …
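As a rough illustration of the phenomenon described in the abstract (not the paper's estimator), the following Python sketch fits bagging ensembles of increasing size and prints the test error, which typically decreases and then stabilizes. The dataset, base classifier, and ensemble sizes are arbitrary choices made for demonstration.

```python
# Minimal sketch, assuming a synthetic binary classification task:
# watch the test error of a majority-vote bagging ensemble stabilize
# as the number of classifiers m grows.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

for m in (1, 5, 25, 100, 400):
    # Majority vote over m bootstrap-trained decision trees (bagging).
    ens = BaggingClassifier(DecisionTreeClassifier(), n_estimators=m, random_state=0)
    ens.fit(X_tr, y_tr)
    err = 1.0 - ens.score(X_te, y_te)
    print(f"m = {m:4d}  test error = {err:.3f}")
```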

Cited by 5 publications (5 citation statements); references 28 publications.
“…The third term in the bound corresponds to the error contribution from working with a finite ensemble size m. Note that the bound is non-asymptotic and holds for any finite ensemble size m, and one may set m to be of order n. The excess risk guarantee improves as the ensemble size m grows, at a speed that matches the asymptotically optimal rate m⁻¹ determined in recent work of Lopes (2020).…”
Section: Main Upper Bound (mentioning)
confidence: 90%
“…However, we are interested in high probability bounds to reveal more information about the worst case behaviour for the excess risk, subject to a failure probability. In another line of research, recent work by Lopes (2020) determined the asymptotic speed of convergence as the number of predictors in the ensemble grows. While this is informative for very large ensembles, we are interested in non-asymptotic guarantees for ensembles of any given finite size.…”
Section: Introduction (mentioning)
confidence: 99%
“…There is also some recent general theoretical work on understanding ensemble methods. Lopes (2019b) derives the rate at which the test error of a finite ensemble approaches its infinite simulation counterpart. Lopes (2019a) proposes a bootstrap method to approximate the variance of an ensemble, with a view to ascertain how large an ensemble is needed.…”
Section: The Random-projection Ensemble Classifier (mentioning)
confidence: 99%
“…Algorithmic convergence has been studied for randomized ensembles to analyze the effect of the ensemble size B on prediction error; see Cannings and Samworth [2017] and Lopes [2020]. Define…”
Section: Prediction Error Based On Rbagging (mentioning)
confidence: 99%
“…Rbagging is essentially a weighted randomized ensemble method. We further quantify the convergence in terms of the algorithmic variance leveraging on recent results in Cannings and Samworth [2017] and Lopes [2020], showing that the algorithmic variance decreases as the number of bootstrap samples increases.…”
Section: Introduction (mentioning)
confidence: 99%
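The citation statements above revolve around the paper's notion of algorithmic variance and its m⁻¹ decay with ensemble size. A brute-force way to see this empirically (not the bootstrap method of Lopes 2019a) is to refit the same ensemble with different random seeds and measure the spread of its test error; the dataset, seeds, and ensemble sizes below are illustrative assumptions.

```python
# Minimal sketch, assuming a fixed training set: estimate the "algorithmic
# variance" of a bagging ensemble's test error by refitting with many seeds,
# for a few ensemble sizes m. Per Lopes (2020), the variance should shrink
# roughly on the order of 1/m as m grows.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=1)

for m in (5, 20, 80):
    errs = []
    for seed in range(30):  # repeated fits isolate the algorithm's own randomness
        ens = BaggingClassifier(n_estimators=m, random_state=seed).fit(X_tr, y_tr)
        errs.append(1.0 - ens.score(X_te, y_te))
    print(f"m = {m:3d}  algorithmic variance of test error ~ {np.var(errs):.2e}")
```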