Abstract. The diversity of an ensemble of classifiers can be calculated in a variety of ways. Here a diversity metric and a means for altering the diversity of an ensemble, called "thinning", are introduced. We evaluate thinning algorithms on ensembles created by several techniques on 22 publicly available datasets. When compared to other methods, our percentage correct diversity measure shows a greater correlation between the increase in voted ensemble accuracy and the diversity value. Also, the analysis of different ensemble creation methods indicates that they generate different levels of diversity. Finally, the methods proposed for thinning show that ensembles can be made smaller without loss in accuracy.