2009
DOI: 10.1016/j.eswa.2008.06.108

Cluster-based under-sampling approaches for imbalanced data distributions

Cited by 587 publications
(262 citation statements)
References 10 publications
“…The last method "average farthest" is similar to the "average nearest" method; it selects the majority class samples which have the farthest average distances from all the minority class samples. These under-sampling approaches based on distance, expend a lot of time selecting the majority class samples in the large dataset, and they are not efficient in real applications [7].…”
Section: Introduction (mentioning)
confidence: 99%
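
The "average farthest" selection described in the statement above can be sketched roughly as follows. This is an illustrative reading of the quoted description, not the cited authors' implementation: the function name `average_farthest`, the parameter `k`, the use of Euclidean distance, and the toy data are all assumptions. The nested loop over both classes also suggests why the statement calls distance-based selection slow on large datasets.

```python
import math

def average_farthest(majority, minority, k):
    """Keep the k majority samples whose mean distance to ALL minority
    samples is largest (hypothetical helper; Euclidean distance assumed)."""
    def mean_distance(sample):
        # One distance per minority sample, so the whole selection costs
        # len(majority) * len(minority) distance computations.
        return sum(math.dist(sample, m) for m in minority) / len(minority)

    # Sort majority samples from farthest to nearest and keep the first k.
    return sorted(majority, key=mean_distance, reverse=True)[:k]

# Toy 2-D data, made up for illustration only.
majority = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (9.0, 9.0)]
minority = [(0.0, 1.0), (1.0, 1.0)]

selected = average_farthest(majority, minority, k=2)
print(selected)  # [(9.0, 9.0), (5.0, 5.0)]
```

Replacing `reverse=True` with `reverse=False` would give the corresponding "average nearest" selection under the same assumptions.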
“…Under-sampling is a technique to reduce the number of samples in the majority class, where the size of the majority class sample is reduced from the original datasets to balance the class distribution. One simple method of under-sampling (random under-sampling) is to select a subset of majority class samples randomly and then combine them with minority class sample as a training set [7]. Many researchers have proposed some advanced way of under-sampling the majority class data.…”
Section: Introduction (mentioning)
confidence: 99%
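
Random under-sampling, as described in the statement above, might look something like this minimal sketch. The function name `random_under_sample`, the `(features, label)` tuple layout, the default of keeping as many majority samples as there are minority samples, and the fixed seed are assumptions made for illustration; the quoted text only says that a random majority subset is combined with the minority samples.

```python
import random

def random_under_sample(majority, minority, n_majority=None, seed=0):
    """Draw a random subset of the majority class and merge it with the
    full minority class to form a training set (hypothetical helper)."""
    rng = random.Random(seed)  # local generator keeps the draw reproducible
    if n_majority is None:
        n_majority = len(minority)  # assumption: aim for a 1:1 class ratio
    n_majority = min(n_majority, len(majority))

    kept = rng.sample(majority, n_majority)  # sampling without replacement
    training = kept + list(minority)
    rng.shuffle(training)  # avoid all minority samples sitting at the end
    return training

# Toy data, made up for illustration: label 0 = majority, 1 = minority.
majority = [((i, i), 0) for i in range(10)]
minority = [((i, -i), 1) for i in range(3)]

training = random_under_sample(majority, minority)
print(len(training))  # 6
```

Because the majority subset is chosen blindly, informative majority samples may be discarded, which is the usual motivation for the more advanced under-sampling methods the statement goes on to mention.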
“…Here we give a brief review of cluster-based under-sampling methods for imbalanced data, because it shows more related to our work [20][21][22][23]. These algorithms differ on whether the clustering is done on the whole training data or inside each category.…”
Section: Classification Of Imbalanced Data (mentioning)
confidence: 99%