2019 International Conference on Electrical, Computer and Communication Engineering (ECCE) 2019
DOI: 10.1109/ecace.2019.8679382

Revisiting the Class Imbalance Issue in Software Defect Prediction

citations
Cited by 12 publications
(3 citation statements)
references
References 32 publications
“…Data resampling techniques were used to tackle data imbalance problems in the data sets. These sampling techniques are widely used in machine learning–based prediction models in different areas [24]. Our first analysis was done without the data resampling technique, where the four machine learning algorithms were applied directly to the data sets.…”
Section: Results (mentioning)
confidence: 99%
“…They evaluated twenty seven data sets, using seven classifiers on seven types of input metrics and various imbalanced learning methods and concluded that imbalanced learning could be considered only for moderate or highly imbalanced software defect prediction datasets. Sohan et al [37] conducted a study to know the inconsistency in the performance among imbalanced dataset and balanced dataset. In this study, eight public data sets were examined with seven classification methods to conclude that the imbalance nature of defective and non-defective classes plays a major role in SDP and among seven classifiers, the voting results in best performer among the classifiers.…”
Section: Related Work (mentioning)
confidence: 99%
“…[8] [9]. Inappropriately [10] [11], Uneven data distribution presents a significant difficulty for the SDP procedure, lowering the quality of the learning model as a result. Due to the asymmetry of the situation, there are fewer malfunctioning modules than there are functional ones.…”
Section: Introduction (mentioning)
confidence: 99%