2013
DOI: 10.1186/1471-2105-14-106

SMOTE for high-dimensional class-imbalanced data

Abstract: Background: Classification using class-imbalanced data is biased in favor of the majority class. The bias is even larger for high-dimensional data, where the number of variables greatly exceeds the number of samples. The problem can be attenuated by undersampling or oversampling, which produce class-balanced data. Generally undersampling is helpful, while random oversampling is not. Synthetic Minority Oversampling Technique (SMOTE) is a very popular oversampling method that was proposed to improve random oversampling…
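As a concrete illustration of the setting the abstract describes, the sketch below builds a small dataset with far more variables than samples and a roughly 9:1 class imbalance, then rebalances it with SMOTE. It uses Python with scikit-learn and imbalanced-learn; the library choice and all parameter values are illustrative assumptions, not taken from the paper.

from collections import Counter

from sklearn.datasets import make_classification
from imblearn.over_sampling import SMOTE

# High-dimensional, class-imbalanced data: 1000 variables, 100 samples,
# roughly 90 majority vs 10 minority samples (the p >> n regime).
X, y = make_classification(
    n_samples=100,
    n_features=1000,
    n_informative=20,
    weights=[0.9, 0.1],
    random_state=0,
)
print("before:", Counter(y))

# SMOTE creates synthetic minority samples by interpolating between a
# minority sample and one of its k nearest minority-class neighbors.
X_res, y_res = SMOTE(k_neighbors=5, random_state=0).fit_resample(X, y)
print("after: ", Counter(y_res))  # classes now balanced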

Cited by 688 publications (460 citation statements)
References 34 publications
“…The results were ranked based on total cost, and a cost-benefit analysis was performed to see if costs could be reduced. In general, as indicated in the literature, under-sampling seems to work better than over-sampling and SMOTE [22]. The authors recommend the usage of random under-sampling as a solution for class imbalanced datasets because it is also computationally less expensive to implement than SMOTE or over-sampling.…”
Section: Results
Mentioning confidence: 93%
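The statement above recommends random under-sampling as the computationally cheaper alternative. A minimal sketch of that option, again assuming the imbalanced-learn API (the citing paper names no specific implementation):

from collections import Counter

from sklearn.datasets import make_classification
from imblearn.under_sampling import RandomUnderSampler

X, y = make_classification(n_samples=100, n_features=1000,
                           weights=[0.9, 0.1], random_state=0)

# Randomly discards majority-class samples until the classes match.
# No neighbor search is required, so this avoids the k-NN cost of SMOTE.
X_res, y_res = RandomUnderSampler(random_state=0).fit_resample(X, y)
print(Counter(y_res))  # roughly 10 samples per class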
“…SMOTE is also computationally expensive to implement when compared to sampling methods like random under-sampling [21]. However, other experiments have proved that simple under-sampling tends to outperform SMOTE in most situations [22]. The performance of classifiers implementing SMOTE has been found to vary based on the number of dimensions in the training dataset [22].…”
Section: Learning From Class Imbalanced Datasets
Mentioning confidence: 99%
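The dimensionality effect mentioned above can be probed with a rough experiment: the same SMOTE-plus-classifier pipeline evaluated at a low and a high feature count. The dataset, classifier, and metric below are illustrative assumptions, not the paper's protocol.

from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

for n_features in (10, 1000):
    X, y = make_classification(n_samples=200, n_features=n_features,
                               n_informative=5, weights=[0.9, 0.1],
                               random_state=0)
    # imbalanced-learn's Pipeline applies SMOTE only to the training folds,
    # so the cross-validated score is not inflated by synthetic test samples.
    pipe = Pipeline([("smote", SMOTE(random_state=0)),
                     ("knn", KNeighborsClassifier(n_neighbors=3))])
    score = cross_val_score(pipe, X, y, cv=5,
                            scoring="balanced_accuracy").mean()
    print(f"{n_features:4d} features: balanced accuracy = {score:.2f}")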
“…This observation needs to be explored in our future work; we also plan to consider sampling methods [5,6,11,15,33,42,102].…”
Section: Figure 68 A Concept Showing Point Correspondence (A) Query
Mentioning confidence: 99%
“…Under-sampling removes some instances of the majority class and thus may lead to a loss of information, whereas over-sampling generates artificial samples for the minority class. The various techniques for handling an imbalance are addressed in [5,6,11,15,33,42,102].…”
Section: Future Work
Mentioning confidence: 99%