Proceedings of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management 2015
DOI: 10.5220/0005595502260234

SCUT: Multi-Class Imbalanced Data Classification using SMOTE and Cluster-based Undersampling


Cited by 74 publications (43 citation statements)
References 11 publications
“…Second, we proposed an integrated resampling approach, by using the SMOTE for over-sampling and PSO for under-sampling. Although some hybrid resampling approaches were presented in previous studies (Agrawal et al 2015; Huda et al 2018), integrating SMOTE-based over-sampling and PSO-based under-sampling has not been explored before. Third, by compiling real-world datasets with different imbalance ratios and testing them with eight machine learning methods, the proposed integrated resampling approach was comprehensively evaluated.…”
Section: Conclusion Implications and Future Research Directions
Citation type: mentioning
confidence: 99%
“…The number of the method's iterations is set as the number of classes. There is also a limited number of works on combinations of oversampling with undersampling (Agrawal et al, 2015), which include a selective hybrid resampling SPIDER3 (Wojciechowski et al, 2017), where relations between classes are captured by predefined misclassification costs. Moreover, Sáez et al (2016) have applied types of minority examples of Napierala and Stefanowski (2012) to independently oversample single minority classes, however without considering any relations between classes.…”
Section: Related Work On Multiclass Imbalances
Citation type: mentioning
confidence: 99%
“…In SOUP, all majority classes are undersampled and all minority classes are oversampled to the cardinality being the average of the sizes of the biggest minority and the smallest majority class (line 3). It is partly inspired by experiences with SCUT undersampling (Agrawal et al, 2015). This provides us not only a dataset with a balanced class distribution, but also with a reasonable size.…”
Section: Resampling Algorithm SOUP
Citation type: mentioning
confidence: 99%
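
The target cardinality described in the SOUP statement above can be illustrated with a small sketch. This is not the cited authors' implementation: the function names `soup_target_size` and `resampling_plan`, the toy label list, and the assumption that the caller already knows which classes count as minority are all hypothetical choices made for illustration. The sketch only computes the target size and the per-class surplus or deficit; it does not perform the resampling itself.

```python
from collections import Counter


def soup_target_size(labels, minority_classes):
    """Mean of the largest minority-class size and the smallest majority-class size.

    Assumption: the minority/majority split is supplied by the caller.
    """
    counts = Counter(labels)
    minority_sizes = [n for c, n in counts.items() if c in minority_classes]
    majority_sizes = [n for c, n in counts.items() if c not in minority_classes]
    if not minority_sizes or not majority_sizes:
        raise ValueError("need at least one minority and one majority class")
    return (max(minority_sizes) + min(majority_sizes)) / 2


def resampling_plan(labels, minority_classes):
    """Per class: number of examples to add (positive) or drop (negative)."""
    # The target is rounded to an integer count (an illustrative choice).
    target = round(soup_target_size(labels, minority_classes))
    return {c: target - n for c, n in Counter(labels).items()}


# Toy data: classes "a" and "b" are treated as majority, "c" and "d" as minority.
labels = ["a"] * 100 + ["b"] * 60 + ["c"] * 20 + ["d"] * 10
print(soup_target_size(labels, {"c", "d"}))
print(resampling_plan(labels, {"c", "d"}))
```

With this toy data the largest minority class has 20 examples and the smallest majority class has 60, so every class is moved toward 40 examples: the majority classes shrink and the minority classes grow, which keeps the balanced dataset at a moderate size.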
“…The class imbalance problem often arises when one class is heavily outnumbered by another class in the dataset [1] [2]. This problem is found in data from many application domains, such as oil spill detection [4], remote sensing [5], text classification [6], response modeling [7], sensor data quality assessment [8], credit card fraud detection [9] and knowledge extraction from databases [10], which makes it important for researchers in the field of data mining [11]. However, the problem is quite difficult because traditional classification algorithms are biased with respect to the minority class [12], meaning that if they are applied anyway, the predictions can be close to erroneous or even wrong [13].…”
Section: Pendahuluan (Introduction)
Citation type: unclassified