2013 IEEE International Conference on Systems, Man, and Cybernetics
DOI: 10.1109/smc.2013.177
An Information-Theoretic Approach for Setting the Optimal Number of Decision Trees in Random Forests

Abstract: Data classification is a process within the data mining and machine learning field which aims at annotating all instances of a dataset with so-called class labels. It involves creating a model from a training set of data instances that are already labeled; this model can then also be used to assign a class to data instances that are not yet classified. A successful way of performing the classification process is provided by the Random Forests (RF) algorithm, which is itself a type of ensemble-based…

Cited by 20 publications (8 citation statements) | References 13 publications
“…To optimize the number of trees while keeping the classification accuracy close to or higher than that of the original RF algorithm, Cuzzocrea et al. proposed a new algorithm [18]. Based on the relationship between the predictive power, meaning the percentage of positively classified instances of the dataset, and the number of trees in a forest, they proposed how to optimize the number of trees in RF using an information-theoretic approach.…”
Section: Related Work
confidence: 99%
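The idea of tuning the forest size against predictive power can be sketched as follows. This is an illustrative stand-in, not the authors' information-theoretic procedure from [18]: the toy dataset, the random threshold "stumps" used in place of real trees, and the plateau threshold `eps` are all assumptions made for the sketch.

```python
import random

random.seed(0)

# Toy 1-D dataset: class 1 iff x > 0.5, with ~10% label noise.
data = [(x, (x > 0.5) != (random.random() < 0.1))
        for x in (random.random() for _ in range(400))]

def make_stump():
    """A randomized 'tree': a random threshold on x (bagging stand-in)."""
    t = random.gauss(0.5, 0.15)
    return lambda x: x > t

def ensemble_accuracy(stumps):
    """Majority-vote accuracy of the current forest on the dataset."""
    correct = 0
    for x, y in data:
        votes = sum(1 for s in stumps if s(x))
        correct += ((votes * 2 > len(stumps)) == y)
    return correct / len(data)

def pick_forest_size(max_trees=200, window=10, eps=0.002):
    """Grow the forest one tree at a time and stop once the accuracy
    gain over the last `window` added trees falls below `eps`
    (a simple plateau heuristic)."""
    stumps, history = [], []
    for t in range(1, max_trees + 1):
        stumps.append(make_stump())
        history.append(ensemble_accuracy(stumps))
        if t > window and history[-1] - history[-1 - window] < eps:
            return t, history[-1]
    return max_trees, history[-1]

n_trees, acc = pick_forest_size()
print(f"stopped at {n_trees} trees, accuracy {acc:.3f}")
```

The design point the citation highlights survives even in this toy: accuracy rises quickly with the first trees and then flattens, so a stopping rule on the accuracy-versus-size curve can cut the forest well below the default size at little cost.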
“…[pseudocode excerpt, symbols lost in extraction: for each instance in the OOB dataset, counts are accumulated from the return value of Test4Classification; two ratios are computed from these counts and passed to F Measure(·, ·)] … a node based on the best attribute is included (Line (6)). Based on the best attribute, C4.5 splits and, thus, generates […].…”
Section: OOB Dataset D
confidence: 99%
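The excerpt above appears to compute two ratios from per-instance classification counts and combine them with an F-measure. A minimal sketch of those standard formulas follows; the names `tp`/`fp`/`fn` are my assumption, since the original symbols were lost in extraction.

```python
def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F_beta score from true-positive, false-positive, and
    false-negative counts."""
    precision = tp / (tp + fp)   # fraction of positive calls that are right
    recall = tp / (tp + fn)      # fraction of actual positives found
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

print(f_measure(tp=80, fp=20, fn=10))  # precision 0.8, recall ~0.889
```

With `beta=1.0` this is the usual harmonic mean of precision and recall, matching the two-ratio-then-F-measure shape of the quoted pseudocode.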
“…Accuracy is simply measured as the probability that the algorithm predicts negative and positive instances correctly [34,35], as:…”
Section: Performance Measures
confidence: 99%
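The formula itself is truncated in the excerpt, but the definition described is presumably the standard confusion-matrix accuracy; a minimal sketch under that assumption (the count names are mine):

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all instances, positive and negative alike,
    that the classifier predicted correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

print(accuracy(tp=45, tn=40, fp=5, fn=10))  # -> 0.85
```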