“…To evaluate the power of the candidate features, it is necessary to calculate the proportion of the revenue increase coalitions according to Eqs. (2) and (1). Theoretically, calculating the Shapley value requires summing over all possible feature subsets, which may lead to high computational complexity.…”
Section: Computational Complexity
“…E NSEMBLE methods are learning algorithms that construct and combine a set of classifiers to classify new unseen data [1]. They tend to use multiple learning algorithms for better predictive performance compared with any other constituent learning algorithms alone [2][3][4][5].…”
The original random forests algorithm has been widely used and has achieved excellent performance in classification and regression tasks. However, research on the theory of random forests lags far behind its applications. To narrow this gap, this paper proposes a new random forests algorithm, called random Shapley forests (RSFs), based on the Shapley value. The Shapley value is a well-known solution concept in cooperative game theory that fairly assesses the power of each player in a game. During construction, RSFs use the Shapley value to evaluate the importance of each feature at each tree node by computing the dependency among the possible feature coalitions. In particular, inspired by existing consistency theory, we prove the consistency of the proposed algorithm. To verify its effectiveness, experiments were conducted on eight UCI benchmark datasets and four real-world datasets. The results show that RSFs perform better than, or at least comparably to, the existing consistent random forests, the original random forests, and a classic classifier, support vector machines.
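The exact Shapley computation underlying this kind of feature scoring can be sketched as follows. This is a minimal illustration, not the paper's method: the characteristic function `v` below is a toy additive payoff (feature 0 contributes twice as much as the others), standing in for whatever coalition-value function RSFs use. The enumeration over all coalitions also makes the exponential cost noted in the quoted passage concrete.

```python
from itertools import combinations
from math import factorial

def shapley_values(features, v):
    """Exact Shapley values by enumerating all coalitions.

    features: list of player (feature) ids
    v: characteristic function mapping a frozenset of features to a payoff
    """
    n = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        for r in range(n):
            for subset in combinations(others, r):
                S = frozenset(subset)
                # Weight |S|!(n-|S|-1)!/n!: the fraction of orderings in
                # which exactly the members of S precede player i.
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                total += w * (v(S | {i}) - v(S))
        phi[i] = total
    return phi

# Toy additive characteristic function (illustrative assumption only):
# feature 0 is worth 2.0 per coalition, every other feature 1.0.
def v(S):
    return sum(2.0 if f == 0 else 1.0 for f in S)

print(shapley_values([0, 1, 2], v))  # feature 0 receives the largest share
```

For an additive game like this toy `v`, each feature's Shapley value equals its individual contribution, and the values sum to `v` of the full coalition (the efficiency axiom) — a quick sanity check on the implementation.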
“…The advantage of the bagging technique is that it reduces the variance of an algorithm by aligning estimates with the desired outcome, thereby improving a model's accuracy [12]. The out-of-bag prediction $H^{oob}(x)$ for an instance $x$ uses only the learners that were not trained on $x$ [13]. The formula is $H^{oob}(x) = \arg\max_{y \in Y} \sum_{t=1}^{T} I(h_t(x) = y) \cdot I(x \notin D_t)$, where $x$ is an instance, $Y$ is the output space, $T$ is the number of learners ($t = 1, \ldots, T$), $h_t$ is the $t$-th learner trained on bootstrap sample $D_t$, and $I(\cdot)$ is the indicator function.…”
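Bagging's out-of-bag prediction — letting each training point be voted on only by the learners whose bootstrap sample excluded it — can be sketched as below. The 1-D threshold-stump base learner and the toy dataset are illustrative assumptions, not from the paper:

```python
import random
from collections import Counter

def fit_stump(xs, ys):
    """Best single-threshold classifier on 1-D data (tiny toy base learner)."""
    best = None
    for t in sorted(set(xs)):
        for lo, hi in ((0, 1), (1, 0)):
            preds = [hi if x >= t else lo for x in xs]
            acc = sum(p == y for p, y in zip(preds, ys))
            if best is None or acc > best[0]:
                best = (acc, t, lo, hi)
    _, t, lo, hi = best
    return lambda x: hi if x >= t else lo

def bagging_oob(xs, ys, T=25, seed=0):
    """Bagging with out-of-bag predictions."""
    rng = random.Random(seed)
    n = len(xs)
    learners, inbag = [], []
    for _ in range(T):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample (with replacement)
        learners.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
        inbag.append(set(idx))
    # Out-of-bag prediction: point i is voted on only by learners
    # whose bootstrap sample did not contain it.
    oob_pred = []
    for i, x in enumerate(xs):
        votes = [h(x) for h, bag in zip(learners, inbag) if i not in bag]
        oob_pred.append(Counter(votes).most_common(1)[0][0] if votes else None)
    return oob_pred

xs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
print(bagging_oob(xs, ys))
```

Because each point is excluded from roughly 37% of the bootstrap samples, the OOB vote gives an unbiased accuracy estimate without a separate hold-out set — the property the quoted formula formalizes.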
Abstract—The rapid growth of online shopping sites makes business in the virtual world very promising. Purchase intention is one of the keys to success for an online store. Several data mining methods can make predictions on online purchase intention datasets, where the data represent the characteristics or habits of each user who has visited a site, whether or not the visit ends in a transaction. Popular algorithms with good performance in data mining include J48 and Logistic Regression. However, data sometimes suffer from class imbalance, so an ensemble technique needs to be applied; one such technique is bagging. This research applies bagging to improve the performance of the J48 and Logistic Regression algorithms. With this technique, the J48 algorithm reaches an accuracy of 89.68% and Logistic Regression 88.50%, an improvement over the initial tests without ensemble techniques. Recall, F-Measure, and AUC values also improved.
Keywords—purchase intention; J48; Logistic Regression; bagging
“…In general, the AdaBoost algorithm trains base classifiers sequentially: in each iteration it uses training data with weight coefficients that depend on the classifiers' performance in the previous iteration, assigning larger weights to misclassified examples (Schapire & Freund, 2013), (Schwenker, 2013).…”
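The sequential reweighting described in the quote can be sketched as follows. This is a minimal AdaBoost illustration under stated assumptions: the 1-D toy dataset and the pool of threshold stumps are invented for the example, and labels are taken in {-1, +1}:

```python
import math

def adaboost(xs, ys, weak_learners, T=10):
    """AdaBoost: train base classifiers sequentially, reweighting the data
    each round so that misclassified points receive larger weight.
    ys in {-1, +1}; weak_learners is a pool of candidate h(x) -> {-1, +1}."""
    n = len(xs)
    w = [1.0 / n] * n
    ensemble = []  # list of (alpha, h) pairs
    for _ in range(T):
        # Pick the pool member with the lowest weighted training error.
        def werr(h):
            return sum(wi for wi, x, y in zip(w, xs, ys) if h(x) != y)
        h = min(weak_learners, key=werr)
        err = werr(h)
        if err == 0:            # perfect learner: keep it and stop
            ensemble.append((1.0, h))
            break
        if err >= 0.5:          # no better than chance: stop
            break
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, h))
        # Upweight mistakes, downweight correct points, then renormalize.
        w = [wi * math.exp(-alpha * y * h(x)) for wi, x, y in zip(w, xs, ys)]
        s = sum(w)
        w = [wi / s for wi in w]
    return lambda x: 1 if sum(a * h(x) for a, h in ensemble) >= 0 else -1

# Toy 1-D data and a pool of threshold stumps (illustrative assumptions).
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1, 1, -1, -1, -1, 1, 1, 1]
stumps = [lambda x, t=t, s=s: s * (1 if x >= t else -1)
          for t in range(1, 9) for s in (1, -1)]
predict = adaboost(xs, ys, stumps)
print(sum(predict(x) == y for x, y in zip(xs, ys)), "of 8 training points correct")
```

Note that no single stump can fit this label pattern; the weighted combination of a few stumps can, which is exactly the point of boosting.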
The enormous volume of data in the banking industry is very difficult, if not impossible, to analyze manually to obtain information useful for policy making. Data mining is therefore expected to contribute to processing such data. Many methods have been used to classify data; one of them is the support vector machine. This study aims to classify customers likely to subscribe to a term deposit in the bank marketing dataset. The research proposes an extension of the support vector machine, the least squares support vector machine, ensembled using boosting. The data processed is the bank marketing dataset. The results show that the proposed ensemble least squares support vector machine outperforms the other methods, with accuracy, sensitivity, and specificity of 95.15%, 92.93%, and 97.61%, respectively, for an overall average classification rate of 95.23%.
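The core of the least squares SVM — replacing the SVM's quadratic program with a single linear system — can be sketched as below. This is a generic LSSVM illustration, not the study's implementation: the RBF kernel, the hyperparameters `gamma` and `sigma`, and the toy 2-D data are assumptions for the example.

```python
import numpy as np

def lssvm_train(X, y, gamma=10.0, sigma=1.0):
    """Least squares SVM: instead of solving a QP, solve the linear system
        [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]
    where K is the kernel matrix (RBF kernel here)."""
    n = len(y)
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    K = np.exp(-sq / (2 * sigma ** 2))
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma   # 1/gamma acts as a ridge term
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    b, alpha = sol[0], sol[1:]

    def predict(Z):
        sq_z = np.sum((Z[:, None, :] - X[None, :, :]) ** 2, axis=2)
        Kz = np.exp(-sq_z / (2 * sigma ** 2))
        return np.sign(Kz @ alpha + b)

    return predict

# Toy linearly separable data (illustrative only).
X = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
predict = lssvm_train(X, y)
print(predict(X))
```

The boosting ensemble the study proposes would then train several such classifiers sequentially on reweighted data, in the manner of the AdaBoost sketch above; the single-model solve shown here is the building block.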