2014 International Computer Science and Engineering Conference (ICSEC), 2014
DOI: 10.1109/icsec.2014.6978197

Under-sampling by algorithm with performance guaranteed for class-imbalance problem

Abstract: The class-imbalance problem arises when the majority class contains far more data than the minority class. Traditional classifiers handle this problem poorly because they focus more on the data in the majority class than on the data in the minority class, and so they tend to predict incoming data as belonging to the majority class. Under-sampling is an efficient way to handle this problem because it selects representatives of the data in the majority class. For this reason, under-s…

Cited by 12 publications (3 citation statements)
References 7 publications (6 reference statements)

“…In Under-sampling methods, samples from the majority class are discarded until the number of samples in each class are nearly equal while preserving valuable information for learning [25,26]. However, it is inevitable that when under-sampling the dataset, some samples that are meaningful to the training model may be ignored [27,28]. After all, different under-sampling methods have different filtering principles.…”
Section: Data-level Methods (mentioning)
confidence: 99%
“…Fu et al (2016) presented an undersampling method based on Principal Component Analysis (PCA) and weighted comprehensive evaluation to improve the dataset's unbalanced condition during the forecast of software fault data. To overcome the problem with the under-sampling method, Jindaluang et al (2014) proposed a cluster-based under-sampling method which uses a clustering algorithm. This algorithm clusters the data in the majority class and selects a number of representative data in many proportions and then combines them with all the data in the minority class as a training set.…”
Section: Data Level Methods (mentioning)
confidence: 99%
“…al. [9] have discussed under-sampling using cluster based approach to balance data. Majority class samples are clustered and prominent samples are selected as the majority samples.…”
Section: A Data Level Approach (mentioning)
confidence: 99%
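
The cluster-based approach described in the citation statements above (cluster the majority class, select representatives, then combine them with all minority-class data) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure from Jindaluang et al. (2014): the use of k-means, the rule of keeping the single real point nearest each centroid, the default of one cluster per minority sample, and the names `kmeans` and `cluster_undersample` are all hypothetical choices made for the example.

```python
import random
from math import dist


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on a list of tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = None
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                      for p in points]
        # Move each centroid to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return centroids, labels


def cluster_undersample(majority, minority, k, seed=0):
    """Assumed selection rule: keep one real majority point per cluster,
    the one closest to the cluster centroid, then add all minority points."""
    centroids, labels = kmeans(majority, k, seed=seed)
    reps = []
    for c in range(k):
        members = [p for p, lab in zip(majority, labels) if lab == c]
        if members:
            reps.append(min(members, key=lambda p: dist(p, centroids[c])))
    X = reps + list(minority)
    y = [0] * len(reps) + [1] * len(minority)  # 0 = majority, 1 = minority
    return X, y


# Synthetic data, made up for the example: 60 majority vs. 5 minority points.
rng = random.Random(1)
majority = ([(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(30)]
            + [(rng.gauss(6, 1), rng.gauss(6, 1)) for _ in range(30)])
minority = [(rng.gauss(3, 0.5), rng.gauss(3, 0.5)) for _ in range(5)]

X, y = cluster_undersample(majority, minority, k=len(minority))
print(len(majority), "majority points reduced to", y.count(0))
```

Setting `k` to the minority-class size gives a roughly balanced training set; the cited description mentions selecting representatives "in many proportions", so other values of `k`, or more than one representative per cluster, would be equally reasonable variations.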