2020
DOI: 10.12962/j24775401.v6i1.6643

Handling Imbalance Data in Classification Model with Nominal Predictors

Abstract: The decision tree, a classification method, can be used to identify the factors that predict an outcome while producing interpretable results. However, when one class makes up only a small share of the data, the classifier tends to predict only the majority class, so class imbalance needs to be handled. One method often used for data with nominal predictors is SMOTE-N. To improve accuracy, the hybrid SMOTE-N-ENN and ADASYN-N were developed. In this study, SMOTE-N, S…
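To make the idea of oversampling nominal data concrete, here is a minimal sketch of SMOTE-N-style generation as it is commonly described: a synthetic minority record takes, for each predictor, the most frequent category among a record and its k nearest minority neighbors. This is not the paper's implementation. SMOTE-N is usually described with a modified Value Difference Metric for distances; the Hamming distance used here is a simplifying assumption, and the function names and the toy records are hypothetical.

```python
import random
from collections import Counter


def hamming(a, b):
    """Count the positions where two nominal records differ."""
    return sum(x != y for x, y in zip(a, b))


def smote_n_sample(minority, index, k, rng):
    """Build one synthetic record from minority[index] and its k nearest neighbors.

    Assumption: Hamming distance stands in for the Value Difference Metric.
    """
    base = minority[index]
    others = [r for j, r in enumerate(minority) if j != index]
    nearest = sorted(others, key=lambda r: hamming(base, r))[:k]
    group = [base] + nearest
    synthetic = []
    for column in zip(*group):
        counts = Counter(column)
        top = max(counts.values())
        # Break ties at random among the most frequent categories.
        candidates = sorted(v for v, c in counts.items() if c == top)
        synthetic.append(rng.choice(candidates))
    return tuple(synthetic)


# Hypothetical minority-class records with three nominal predictors.
minority = [
    ("urban", "low", "yes"),
    ("urban", "mid", "yes"),
    ("rural", "low", "yes"),
    ("urban", "low", "no"),
]
new_record = smote_n_sample(minority, index=0, k=2, rng=random.Random(0))
```

Because every value in the synthetic record is copied from an existing category, the output stays a valid nominal record, which is the reason interpolation-based SMOTE cannot be applied directly to this kind of data.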

Cited by 7 publications (7 citation statements)
References 5 publications
“…In this study, the data mining process involves the process of balancing data and sharing training data and test data. Data balancing is a process to equalize the amount of data in each class to improve the accuracy of the system during the learning process [19]. In this stage, learning the classification model was carried out, namely grouping TB data into pulmonary and extrapulmonary class categories.…”
Section: Data Mining Process
mentioning
confidence: 99%
“…where, $X_i$ = vector of features in the minority class, $X_{knn}$ = k-nearest neighbors for $X_i$, $\delta$ = random number between 0 to 1. After balancing the data, the next step is to divide the data set into k to n partitions. Splitting this data is known as K-Fold Cross Validation and is a popular method of solving statistical data where the data is divided into two subsets, namely training data for the learning process and test data for validation or assessment used to assess performance models, methods, or algorithms [19]. K-Fold cross validation can be selected based on dataset size.…”
Section: Data Mining Process
mentioning
confidence: 99%
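The excerpt above defines the symbols of an interpolation formula whose body is cut off, then describes K-Fold Cross Validation. The sketch below illustrates both steps under stated assumptions: it assumes the standard SMOTE form, x_new = x_i + delta * (x_knn - x_i), which the quoted text does not confirm, and it uses a simple shuffled K-fold split. The function names and the toy values are hypothetical.

```python
import random


def smote_sample(x_i, neighbors, rng):
    """Create one synthetic point, assuming x_new = x_i + delta * (x_knn - x_i)."""
    x_knn = rng.choice(neighbors)  # one of the k nearest minority neighbors
    delta = rng.random()           # random number in [0, 1)
    return [a + delta * (b - a) for a, b in zip(x_i, x_knn)]


def k_fold_indices(n_samples, k, rng):
    """Shuffle the indices and yield (train, test) index lists for each of k folds."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


rng = random.Random(0)
x_i = [1.0, 2.0]                      # hypothetical minority sample
neighbors = [[2.0, 4.0], [0.0, 1.0]]  # hypothetical nearest neighbors
synthetic = smote_sample(x_i, neighbors, rng)
splits = list(k_fold_indices(10, 5, rng))
```

Each fold serves as the test set exactly once while the remaining folds form the training set, so every record is used for both training and validation across the k rounds.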
“…The data preprocessing stage also includes data transformation, which covers data generalization, smoothing, normalization, and attribute construction. Imbalanced data also needs to be handled, because class imbalance makes the accuracy unreliable [7]. The accuracy of the algorithm can be improved after data preprocessing [8].…”
Section: Introduction
unclassified
“…Fithrasari et al., in 'Handling Imbalance Data in Classification Model with Nominal Predictors' (2020), studied the handling of imbalanced data in classification models with nominal predictors [16]. They used the 2018 Survei Kinerja dan Akuntabilitas Kependudukan Keluarga Berencana dan Pembangunan Keluarga (SKAP KKBPK) data for Jawa Timur Province.…”
mentioning
confidence: 99%