2015
DOI: 10.3233/ida-150789
Improving Random Forest and Rotation Forest for highly imbalanced datasets

Cited by 35 publications (20 citation statements)
References 24 publications
Citing publications: 2017–2024
“…Before selecting subsamples, the sets of sample attributes are randomly segmented and combined to obtain sequences of attribute subsets, whose data can then be preprocessed by feature transformation. Compared with the Random Forest algorithm, on which Rotation Forest (RF) builds, the RF algorithm performs better on high-dimensional, small-sample databases [60, 61, 62]. The main procedure for building an RF model is: (1) divide the attribute set into several subsets; (2) obtain sample subsets by resampling and apply a feature transformation to each attribute subset; (3) rearrange the rotation matrix to follow the order of the original attribute set; (4) train base classifiers on the rotated data; and (5) integrate the outputs of the base classifiers and output the final predicted category.…”
Section: Methodology
confidence: 99%
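The five-step procedure quoted above maps directly onto code. Below is a minimal sketch of Rotation Forest following those steps, assuming scikit-learn's PCA as the feature transformation and decision trees as the base classifiers; the class name, number of feature subsets, and bootstrap fraction are illustrative choices, not taken from the paper.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier

class RotationForestSketch:
    """Illustrative Rotation Forest; assumes non-negative integer class labels."""

    def __init__(self, n_estimators=10, n_feature_subsets=3, random_state=None):
        self.n_estimators = n_estimators
        self.n_feature_subsets = n_feature_subsets
        self.rng = np.random.default_rng(random_state)

    def _build_rotation(self, X):
        n_features = X.shape[1]
        # (1) randomly segment the attribute set into disjoint subsets
        subsets = np.array_split(self.rng.permutation(n_features),
                                 self.n_feature_subsets)
        rotation = np.zeros((n_features, n_features))
        for subset in subsets:
            # (2) draw a bootstrap sample subset and fit the feature
            # transformation (here PCA) on just these attributes
            rows = self.rng.choice(len(X), size=int(0.75 * len(X)), replace=True)
            pca = PCA(n_components=len(subset)).fit(X[np.ix_(rows, subset)])
            # (3) place the loadings back at the original attribute positions,
            # so the assembled rotation matrix follows the original feature order
            rotation[np.ix_(subset, subset)] = pca.components_.T
        return rotation

    def fit(self, X, y):
        self.rotations_, self.trees_ = [], []
        for _ in range(self.n_estimators):
            R = self._build_rotation(X)
            # (4) train one base classifier on the rotated data
            # (PCA centering is ignored; train and test use the same rotation)
            self.rotations_.append(R)
            self.trees_.append(DecisionTreeClassifier().fit(X @ R, y))
        return self

    def predict(self, X):
        # (5) integrate the base classifiers' outputs by majority vote
        votes = np.stack([tree.predict(X @ R).astype(int)
                          for R, tree in zip(self.rotations_, self.trees_)])
        return np.apply_along_axis(lambda col: np.bincount(col).argmax(),
                                   0, votes)
```

Each tree sees all samples in a differently rotated feature space, which is what gives the ensemble its diversity relative to plain Random Forest.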
“…Rotation Forest has also been applied to imbalanced problems; for example, Su et al. [22] used the Hellinger distance decision tree (HDDT) [23, 24] instead of C4.5 to train the individual classifiers on the whole training set. Hosseinzadeh and Eftekharia [25] learned Rotation Forest on data obtained by preprocessing the training set with the synthetic minority oversampling technique (SMOTE) [8] and fuzzy clustering [40].…”
Section: Ensemble For Imbalanced Problem
confidence: 99%
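The HDDT variant mentioned in this excerpt replaces C4.5's information-gain criterion with the Hellinger distance between the class-conditional distributions a split induces; because the criterion uses only within-class proportions and ignores class priors, it is insensitive to skew. A minimal sketch of that criterion for a binary split follows; the function name and interface are illustrative, not taken from the cited papers.

```python
import numpy as np

def hellinger_split_value(y, go_left, pos_label=1):
    """Hellinger distance between the class-conditional distributions
    induced by a candidate split (True = sample goes to the left branch)."""
    y, go_left = np.asarray(y), np.asarray(go_left)
    pos, neg = (y == pos_label), (y != pos_label)
    t_pos, t_neg = pos.sum(), neg.sum()      # per-class totals (priors unused)
    d = 0.0
    for branch in (go_left, ~go_left):       # left branch, then right branch
        p = (branch & pos).sum() / t_pos     # P(branch | positive class)
        q = (branch & neg).sum() / t_neg     # P(branch | negative class)
        d += (np.sqrt(p) - np.sqrt(q)) ** 2
    return np.sqrt(d)                        # ranges over [0, sqrt(2)]

# A split that isolates most minority samples scores near the maximum
# even though the minority class is only 5% of the data.
y = np.array([0] * 95 + [1] * 5)
split = np.array([False] * 95 + [True] * 5)  # send the minority left
print(hellinger_split_value(y, split))       # ~1.414 (maximal separation)
```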
“…The main heuristic is to apply feature extraction and subsequently reconstruct a full feature set for each classifier in the ensemble. This method has also been applied to class-imbalanced data; for example, Su et al. [22] employed the Hellinger distance decision tree (HDDT) [23, 24] instead of C4.5 or CART as the base learner of Rotation Forest to deal with class-imbalance issues. Hosseinzadeh and Eftekharia [25] preprocessed the original data with fuzzy clustering and the synthetic minority oversampling technique (SMOTE) to obtain the training set on which Rotation Forest is learned.…”
Section: Introduction
confidence: 99%
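The preprocess-then-learn pattern described in these excerpts can be sketched with imbalanced-learn's SMOTE. The fuzzy-clustering step of Hosseinzadeh and Eftekharia is omitted here, the data are a synthetic toy problem, and the ensemble reuses the illustrative RotationForestSketch class defined in the earlier sketch.

```python
from collections import Counter

from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification

# A highly imbalanced toy problem (roughly 5% minority class).
X, y = make_classification(n_samples=1000, n_features=20,
                           weights=[0.95, 0.05], random_state=0)
print("class counts before:", Counter(y))

# Oversample the minority class, then learn the ensemble on the balanced set.
X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print("class counts after: ", Counter(y_res))

clf = RotationForestSketch(n_estimators=10, random_state=0).fit(X_res, y_res)
print("minority-class prediction rate:", clf.predict(X).mean())
```

Oversampling happens strictly before ensemble construction here, which is the structure both excerpts attribute to Hosseinzadeh and Eftekharia.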
“…This method has also been applied to imbalanced problems; for example, Su et al. [21] employed a class-imbalance-oriented learner, namely the Hellinger distance decision tree (HDDT), as the base classifier of rotation forest to handle class-imbalanced problems, with each base classifier constructed on the whole training set. Hosseinzadeh and Eftekharia [22] learned rotation forest on data obtained by preprocessing the training set with the synthetic minority oversampling technique (SMOTE) and fuzzy clustering.…”
Section: Strategies For Imbalanced Medical Datasets
confidence: 99%