In the Big Data era, scalability is an essential characteristic of machine learning algorithms. Most data discovery algorithms apply a feature selection (FS) method as a crucial preprocessing step. The main objective of FS is to select a subset of informative features in such a way that the discriminating power of the data is preserved. Unfortunately, most traditional feature selection algorithms are not scalable, which is a significant weakness when coping with big datasets. This paper proposes a distributed and Scalable Global Mutual Information-based feature selection framework, called SGMI, to deal with large-scale datasets. The framework first generates a similarity matrix that represents the dependency among all features. To this end, the joint-value histograms of paired feature columns are generated in a scalable way and in a single pass. Next, based on these histograms, the elements of the dependency criterion, namely the individual and joint entropies, are extracted independently. Finally, the SGMI framework applies an optimization method to rank features based on the similarity matrix. In this paper, three popular optimization methods, Quadratic Programming (QP), Spectral Relaxation (SR), and Truncated Power (TP), are plugged into the proposed framework, producing three scalable FS methods: SGMI-QP, SGMI-SR, and SGMI-TP. Experimental studies are performed on four balanced and imbalanced large-scale datasets, and the empirical outcomes are compared with a distributed feature selection method, DiRelief, and with the original versions of the produced methods. The experimental results illustrate that (i) all produced methods are scalable and have a lower execution time than both their traditional versions and the DiRelief method; (ii) SGMI-QP has a lower execution time than the other two; (iii) there is no significant difference among the produced methods' outcomes on the balanced big datasets; and (iv) in general, SGMI-SR copes with big datasets better than SGMI-QP, SGMI-TP, and DiRelief.
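To make the pipeline concrete, the following is a minimal single-machine sketch in Python with NumPy of the three steps the abstract outlines: joint-value histograms of paired feature columns, mutual information derived from the individual and joint entropies, and a ranking step based on the leading eigenvector of the similarity matrix, a standard form of spectral relaxation. The function names (mutual_information, mi_similarity_matrix, spectral_relaxation_ranking), the bin count, and the toy data are illustrative assumptions; the actual SGMI framework is distributed, and this sketch is not the authors' implementation.

import numpy as np

def mutual_information(xi, xj, bins=16):
    # Joint-value histogram of one pair of feature columns (single pass).
    joint, _, _ = np.histogram2d(xi, xj, bins=bins)
    pxy = joint / joint.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)           # marginal distributions
    h = lambda p: -np.sum(p[p > 0] * np.log(p[p > 0]))  # Shannon entropy
    return h(px) + h(py) - h(pxy)                       # I = H(X) + H(Y) - H(X, Y)

def mi_similarity_matrix(X, bins=16):
    # Pairwise mutual information among all feature columns of X.
    d = X.shape[1]
    S = np.zeros((d, d))
    for i in range(d):
        for j in range(i, d):
            S[i, j] = S[j, i] = mutual_information(X[:, i], X[:, j], bins)
    return S

def spectral_relaxation_ranking(S):
    # Rank features by the leading eigenvector of the similarity matrix,
    # a standard spectral relaxation of the quadratic ranking objective.
    _, vecs = np.linalg.eigh(S)
    return np.argsort(np.abs(vecs[:, -1]))[::-1]        # highest score first

X = np.random.rand(1000, 8)                             # toy data: 1000 rows, 8 features
print(spectral_relaxation_ranking(mi_similarity_matrix(X)))

In the distributed setting the abstract describes, the histogram and entropy computations for different feature pairs are independent, so they can be assigned to separate workers; only the ranking step needs the assembled similarity matrix.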