2018 International Conference on Current Trends Towards Converging Technologies (ICCTCT) 2018
DOI: 10.1109/icctct.2018.8551020

A Review on Handling Imbalanced Data

Cited by 88 publications (47 citation statements)
References 31 publications
“…(1) Predicting death from hospital-acquired infections in trauma patients in the absence of a balanced dataset (C5.0 and CHAID); (2) Predicting death from hospital-acquired infection in the trauma patients using a balanced dataset by sampling methods (reduced data set) (C5.0 and CHAID); (3) Clustering hospital-acquired infections in trauma patients by k-means algorithms; (4) Predicting death from hospital-acquired infections in trauma patients in each cluster (C5.0 and CHAID); (5) Predicting death from hospital-acquired infections in trauma patients with SMOTE-C5.0 and ADASYN-C5.0; (6) Predicting death from hospital-acquired infections in the trauma patients with SMOTE-SVM, ADASYN-SVM, SMOTE-ANN, and ADASYN-ANN. Many previous studies have attempted to handle unbalanced data [12][13][14] by adopting various approaches, such as using the right evaluation metrics, resampling the training set (under-sampling, and over-sampling), using K-fold cross-validation appropriately, ensemble different resampled datasets, resampling different ratios, and clustering the frequent class. However, no best model for these problems has been identified, while this strongly relates to techniques, models, and subjects used [2].…”
Section: Introduction (mentioning)
confidence: 99%
“…Many previous studies have attempted to handle unbalanced data [12-14] by adopting various approaches, such as using the right evaluation metrics, resampling the training set (under-sampling, and over-sampling), using K-fold cross-validation appropriately, ensemble different resampled datasets, resampling different ratios, and clustering the frequent class. However, no best model for these problems has been identified, while this strongly relates to techniques, models, and subjects used [2].…”
Section: Introduction (mentioning)
confidence: 99%
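
The resampling approaches named in the statement above (under-sampling and over-sampling of the training set) can be illustrated with a minimal sketch. This is not the method of the cited paper or of the review; it is plain random resampling using only the Python standard library, and the function names `random_oversample` and `random_undersample` are hypothetical, chosen for illustration.

```python
import random
from collections import Counter


def _group_by_class(rows, labels):
    """Collect rows into a dict keyed by class label."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    return by_class


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until
    every class has as many rows as the largest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        out_rows.extend(members + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


def random_undersample(rows, labels, seed=0):
    """Keep a random subset of each class so that every class
    has as many rows as the smallest class."""
    rng = random.Random(seed)
    by_class = _group_by_class(rows, labels)
    target = min(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        out_rows.extend(rng.sample(members, target))
        out_labels.extend([label] * target)
    return out_rows, out_labels


if __name__ == "__main__":
    # Toy training set (made-up data): 8 rows of class 0, 2 rows of class 1.
    rows = [[i] for i in range(10)]
    labels = [0] * 8 + [1] * 2
    print(Counter(random_oversample(rows, labels)[1]))
    print(Counter(random_undersample(rows, labels)[1]))
```

As the quoted statement specifies, resampling of this kind is applied to the training set; a held-out test set would normally keep its original class distribution.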
“…DNN can be discriminatively trained with backpropagation that uses cost derivatives (f '(C)) to calculate the difference between the target output and actual output. Weights are updated using (7).…”
Section: Fig 2 Deep Neural Network (mentioning)
confidence: 99%
“…The number of positive interactions is very small when compared to the negative interactions. This imbalance can cause inaccurate classification and prediction models [7]. Therefore, additional handling of imbalanced data is required.…”
Section: Introduction (mentioning)
confidence: 99%
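
One way to see why imbalance can produce misleading classification results, as the statement above notes, is to compare accuracy with recall on a skewed toy dataset. The sketch below uses made-up labels (95 negatives, 5 positives) and a trivial predictor that always outputs the negative class; the function names are hypothetical and nothing here is taken from the cited papers.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of true positive-class items that were predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp / (tp + fn) if (tp + fn) else 0.0


if __name__ == "__main__":
    # Made-up imbalanced labels: 95 negatives (0) and 5 positives (1).
    y_true = [0] * 95 + [1] * 5
    # A predictor that ignores its input and always answers "negative".
    y_pred = [0] * 100
    print("accuracy:", accuracy(y_true, y_pred))
    print("recall:  ", recall(y_true, y_pred))
```

On this toy data the always-negative predictor scores 0.95 accuracy while its recall on the positive class is 0.0, which is why evaluation metrics other than plain accuracy are commonly recommended for imbalanced data.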
“…Generally, the classification of imbalanced datasets has significant attention in different fields thanks to its wide real applications (Spelmen and Porkodi 2018;Johnson and Khoshgoftaar 2019;López et al 2013). However, the imbalanced dataset problem has been neglected by most of the existing works about bankruptcy prediction.…”
Section: Introduction (mentioning)
confidence: 99%