2022
DOI: 10.1016/j.jksuci.2021.01.014

SMOTE-LOF for noise identification in imbalanced data classification

Cited by 53 publications (28 citation statements)
References 33 publications
“…Therefore, the dataset is highly imbalanced, which usually leads to models with poor generalization ability. Both oversampling and undersampling methods have been widely utilized to solve the class imbalance challenge in different ML applications [76]- [78]. However, oversampling methods create balanced training sets by duplicating samples in the minority class, which could result in overfitting [79], while undersampling methods obtain balanced datasets by discarding selected majority class instances.…”
Section: Proposed CCFD Methods, A. Data Resampling (mentioning)
confidence: 99%
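The two resampling strategies contrasted in the statement above can be illustrated in plain NumPy (a minimal sketch; the class sizes and labels are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
y = np.array([0] * 90 + [1] * 10)  # 90 majority samples, 10 minority samples

maj_idx = np.flatnonzero(y == 0)
min_idx = np.flatnonzero(y == 1)

# Random oversampling: duplicate minority samples until the classes balance
over = np.concatenate(
    [maj_idx, rng.choice(min_idx, size=len(maj_idx), replace=True)]
)

# Random undersampling: discard majority samples until the classes balance
under = np.concatenate(
    [rng.choice(maj_idx, size=len(min_idx), replace=False), min_idx]
)
```

Oversampling repeats minority rows, which is exactly why it risks overfitting, while undersampling throws away majority rows that may carry useful information.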
“…Both oversampling and undersampling methods have been widely utilized to solve the class imbalance challenge in different ML applications [76]- [78]. However, oversampling methods create balanced training sets by duplicating samples in the minority class, which could result in overfitting [79], while undersampling methods obtain balanced datasets by discarding selected majority class instances. Hence, undersampling could remove useful examples that might be crucial in building efficient ML models, and it is also inefficient in highly imbalanced datasets like the European credit card dataset.…”
Section: Proposed CCFD Methods, A. Data Resampling (mentioning)
confidence: 99%
“…SMOTE first computes the difference between a feature vector of a minority-class sample and one of its nearest neighbors from the minority class, then multiplies that difference by a random number between 0 and 1. The result is added to the original feature vector to obtain a new synthetic sample [24].…”
Section: SMOTE (mentioning)
confidence: 99%
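The interpolation step described in the statement above can be sketched in a few lines of NumPy (a minimal illustration of standard SMOTE interpolation, not a reference implementation; the array layout and neighbor count `k` are assumptions):

```python
import numpy as np

def smote_sample(minority, idx, k=5, rng=None):
    """Generate one synthetic sample for minority[idx] via SMOTE interpolation."""
    rng = np.random.default_rng(rng)
    x = minority[idx]
    # Distances from x to every other minority sample
    d = np.linalg.norm(minority - x, axis=1)
    d[idx] = np.inf                       # exclude the point itself
    neighbors = np.argsort(d)[:k]         # indices of k nearest minority neighbors
    nb = minority[rng.choice(neighbors)]  # pick one neighbor at random
    gap = rng.random()                    # random number in [0, 1)
    # Add the scaled difference to the original feature vector
    return x + gap * (nb - x)

minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
new = smote_sample(minority, 0, k=2, rng=0)
```

Because the synthetic point lies on the line segment between the original sample and the chosen neighbor, every coordinate stays within the bounding box of those two points.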
“…The Synthetic Minority Oversampling Technique (SMOTE) is the established geometric approach to balancing classes by oversampling the minority class [10]. Multiple variations of SMOTE have been developed [11], including novel approaches such as SMOTE-LOF, which uses the Local Outlier Factor [12] to identify noisy synthetic samples. Furthermore, overlap between samples from different classes has been reported as a major issue in imbalance problems.…”
Section: Related Work (mentioning)
confidence: 99%
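The SMOTE-LOF idea referenced above — scoring synthetic samples with the Local Outlier Factor and discarding the noisy ones — can be sketched with scikit-learn's `LocalOutlierFactor` (a hedged illustration of the concept, not the authors' implementation; the neighbor count and the use of the default outlier threshold are assumptions):

```python
import numpy as np
from sklearn.neighbors import LocalOutlierFactor

def filter_noisy_synthetics(X_real, X_syn, n_neighbors=5):
    """Keep only synthetic samples that LOF does not flag as outliers
    relative to the combined data (sketch of the SMOTE-LOF filtering idea)."""
    X_all = np.vstack([X_real, X_syn])
    lof = LocalOutlierFactor(n_neighbors=n_neighbors)
    labels = lof.fit_predict(X_all)    # 1 = inlier, -1 = outlier
    syn_labels = labels[len(X_real):]  # labels for the synthetic portion
    return X_syn[syn_labels == 1]

# Tight minority cluster plus one far-away "noisy" synthetic point
rng = np.random.default_rng(0)
X_real = rng.normal(0.0, 0.1, size=(30, 2))
X_syn = np.vstack([rng.normal(0.0, 0.1, size=(5, 2)), [[10.0, 10.0]]])
kept = filter_noisy_synthetics(X_real, X_syn, n_neighbors=5)
```

Synthetic points inside the dense cluster have LOF scores near 1 and survive, while the distant point receives a large LOF score and is dropped.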