2011 IEEE 11th International Conference on Data Mining
DOI: 10.1109/icdm.2011.129
Semi-supervised Feature Importance Evaluation with Ensemble Learning

Abstract: We consider the problem of using a large amount of unlabeled data to improve the efficiency of feature selection in high-dimensional datasets when only a small set of labeled examples is available. We propose a new semi-supervised feature importance evaluation method (SSFI for short) that combines ideas from co-training and random forests with a new permutation-based out-of-bag feature importance measure. We provide empirical results on several benchmark datasets indicating that SSFI can lead to sig…
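The truncated abstract names the two ingredients of SSFI: a co-training-style use of unlabeled data and a permutation-based out-of-bag (OOB) feature importance measure. The sketch below is not the authors' implementation; it illustrates only the second ingredient, using a hand-rolled bagged forest so each tree's OOB indices are known. The dataset, forest size, and tree settings are illustrative assumptions.

```python
# Sketch of permutation-based out-of-bag (OOB) feature importance
# (the measure the abstract refers to), not the authors' SSFI code.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           random_state=0)
n, d = X.shape
n_trees = 50
importances = np.zeros(d)

for t in range(n_trees):
    boot = rng.integers(0, n, size=n)        # bootstrap rows for this tree
    oob = np.setdiff1d(np.arange(n), boot)   # rows the tree never saw
    tree = DecisionTreeClassifier(max_features="sqrt", random_state=t)
    tree.fit(X[boot], y[boot])
    base_acc = tree.score(X[oob], y[oob])    # OOB accuracy before permutation
    for j in range(d):
        X_perm = X[oob].copy()
        X_perm[:, j] = rng.permutation(X_perm[:, j])  # break feature j's link to y
        importances[j] += base_acc - tree.score(X_perm, y[oob])

importances /= n_trees
print(np.argsort(importances)[::-1][:5])     # five most important feature indices
```

The per-tree accuracy drop under permutation is averaged over the forest; SSFI additionally exploits unlabeled data via co-training, which this sketch deliberately omits.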


Cited by 14 publications (5 citation statements). References 24 publications.
“…For the production of distinct data sets, a blend of resampling and random subspace might be utilized. Several classifiers are trained, and then their output results are combined in ensemble learning methods [245].…”
Section: Semi-supervised Wrapper Methods (mentioning)
confidence: 99%
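As a reading aid for the quoted statement, here is a minimal sketch of an ensemble built from distinct data sets produced by resampling plus random subspaces, with outputs combined by majority vote. The ensemble size, subspace size, and toy dataset are assumptions, not details from reference [245].

```python
# Sketch: resampling + random subspace to produce distinct training sets,
# then majority-vote combination, as described in the quoted statement.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(1)
X, y = make_classification(n_samples=400, n_features=30, random_state=1)

members = []
for _ in range(25):
    rows = rng.integers(0, len(X), size=len(X))            # resampling (bootstrap)
    cols = rng.choice(X.shape[1], size=10, replace=False)  # random subspace
    clf = DecisionTreeClassifier(random_state=0).fit(X[rows][:, cols], y[rows])
    members.append((clf, cols))

# Each member votes using only its own feature subspace.
votes = np.stack([clf.predict(X[:, cols]) for clf, cols in members])
pred = (votes.mean(axis=0) > 0.5).astype(int)              # majority vote
print("vote accuracy on the training pool:", (pred == y).mean())
```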
“…T represents the redundancy threshold, and our goal is to select the top 20% of non-redundant augmented features, starting from an empty set. So, by the end of step 5, after 6 iterations, the column indices of the features selected by RSSFS (ssl_set) and by the supervised baseline (sl_set) are:
ssl_set = [[1,3,7,11,13,21], [1,3,10,12,20,22], [1,3,9,11,23,25], [1,3,7,19,17,30], [1,3,7,11,20,29], [1,5,7,11,23,30]]
sl_set = [[3,7,11,17,21,28], [3,7,11,17,21,28], [3,7,11,17,23,19], [3,7,11,18,25,29], [3,7,9,16,27,22], [3,6,10,12,24,26]]
freq_ssl_set = [1,3,7,11,23,30]
freq_sl_set = [3,7,11,17,21,28] …”
Section: Case Study (mentioning)
confidence: 99%
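The quoted case study reduces six per-iteration index lists to the six most frequently selected indices (freq_ssl_set). Below is a small sketch of that frequency step, using the ssl_set values copied from the quote; the top-k-by-count rule is our assumption about how the reduction is resolved, and the quote does not specify how the tie between indices 20, 23, and 30 (each appearing twice) is broken.

```python
# Frequency step implied by the quoted case study: keep the six indices
# selected most often across the six iterations. Values are from the quote;
# the top-k-by-count rule (and its tie-breaking) is an assumption.
from collections import Counter

ssl_set = [
    [1, 3, 7, 11, 13, 21], [1, 3, 10, 12, 20, 22], [1, 3, 9, 11, 23, 25],
    [1, 3, 7, 19, 17, 30], [1, 3, 7, 11, 20, 29], [1, 5, 7, 11, 23, 30],
]
counts = Counter(f for run in ssl_set for f in run)
freq_ssl_set = [f for f, _ in counts.most_common(6)]
print(sorted(counts.items(), key=lambda kv: -kv[1])[:8])
print(freq_ssl_set)  # 1, 3, 7, 11 are unambiguous; 20/23/30 tie at two occurrences
```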
“…Given freq_ssl_set = [1,3,7,11,23,30] and freq_sl_set = [3,7,11,17,21,28], find the similarity between the frequent feature sets of the semi-supervised and supervised selections: similar features = …”
Section: Case Study (mentioning)
confidence: 99%
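The quote breaks off at "similar features =". A plain set intersection (with Jaccard similarity as a score) is one natural reading; the quoted text does not say which similarity measure is actually used.

```python
# Assumed similarity step: intersect the two frequent feature sets and
# report Jaccard similarity. The measure is our assumption, not the paper's.
freq_ssl_set = {1, 3, 7, 11, 23, 30}
freq_sl_set = {3, 7, 11, 17, 21, 28}

similar_features = sorted(freq_ssl_set & freq_sl_set)
jaccard = len(freq_ssl_set & freq_sl_set) / len(freq_ssl_set | freq_sl_set)
print(similar_features)        # [3, 7, 11]
print(round(jaccard, 2))       # 0.33
```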
“…SL semisupervised feature selection methods can be categorized into five classes: graph-based semisupervised feature selection, an example of which can be found in Cheng, Deng, Fu, Wang, and Qin's () study; self-training-based semisupervised feature selection (Bellal, Elghazel, & Aussem, ); co-training-based semisupervised feature selection (Barkia, Elghazel, & Aussem, ); SVM-based semisupervised feature selection (Ang, Haron, & Hamed, ); and other semisupervised feature selection methods. The majority of semisupervised feature selection methods construct a graph over the training samples, corresponding to graph-based semisupervised learning methods.…”
Section: Multilabel Feature Selection (mentioning)
confidence: 99%