2020
DOI: 10.1109/tse.2018.2876537
The Impact of Class Rebalancing Techniques on the Performance and Interpretation of Defect Prediction Models

Abstract: Defect prediction models that are trained on class-imbalanced datasets (i.e., datasets in which defective and clean modules are not equally represented) are highly susceptible to producing inaccurate predictions. Prior research compares the impact of class rebalancing techniques on the performance of defect prediction models, but arrives at contradictory conclusions due to different choices of datasets, classification techniques, and performance measures. Such contradictory concl…
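To make the class rebalancing the abstract refers to concrete, the sketch below implements random oversampling, one common rebalancing technique (the paper compares several; this is not its specific method). The function name and the toy dataset are invented for illustration.

```python
import random

def random_oversample(features, labels, seed=0):
    """Randomly duplicate minority-class samples until every class is
    represented as often as the majority class (random oversampling)."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(features, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        out_x.extend(xs + [rng.choice(xs) for _ in range(target - len(xs))])
        out_y.extend([y] * target)
    return out_x, out_y
```

On a toy dataset of 8 clean and 2 defective modules, the result contains 8 samples of each class; other techniques (random undersampling, SMOTE, etc.) alter the distribution differently.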

Cited by 218 publications (158 citation statements)
References 85 publications
“…PatchNet achieves an average AUC of 0.808. Since the new five test sets are highly imbalanced (only 15.79% patches are stable patches), we omit the other metrics (i.e., accuracy, precision, recall, and F1) [49], [53], [60]. We also trained PatchNet on a whole training dataset (i.e., 42,408 stable patches and 39,995 non-stable patches) and evaluated it on 184,481 non-stable patches.…”
Section: RQ2: How Effective Is PatchNet Compared To Other State-of-the-Art… (mentioning)
confidence: 99%
“…PatchNet achieves an AUC score of 0.805. Again we only use AUC as this dataset is highly imbalanced [49], [53], [60].…”
Section: RQ2: How Effective Is PatchNet Compared To Other State-of-the-Art… (mentioning)
confidence: 99%
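The preference for AUC on highly imbalanced test sets expressed in the quotes above can be made concrete: AUC is the probability that a randomly chosen positive example is ranked above a randomly chosen negative one, so it depends only on the ranking, not on the class ratio. A minimal rank-based sketch (ties counted as 1/2; the function name is illustrative):

```python
def auc(scores, labels):
    """AUC as the probability that a randomly chosen positive example
    is scored above a randomly chosen negative one (ties count 1/2).
    Because it only depends on the ranking, it is insensitive to the
    class ratio, unlike accuracy, precision, recall, or F1."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A perfect ranking yields 1.0 whether the positives are 50% or 0.1% of the test set, which is why accuracy-style metrics are omitted on the imbalanced patch datasets.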
“…Different subsets of metrics among training samples may pose a critical threat to validity when analysing and identifying the most important metrics. For example, prior work often applies post-hoc multiple comparison analyses (e.g., a Scott-Knott test) on the distributions of importance scores to identify statistically distinct ranks of the most important metrics [34,71,80]. However, such post-hoc analyses cannot be applied when feature selection techniques produce different subsets of metrics.…”
Section: Case Study Results (mentioning)
confidence: 99%
“…Finally, RFE provides the subset of metrics that yields the best performance according to an evaluation criterion (e.g., AUC). In our study, we select the AUC measure since it measures the discriminatory power of prediction models [23,44,60,71]. We use the implementation of recursive feature elimination provided by the rfe function of the caret R package [43].…”
Section: B. Wrapper-Based Feature Selection Techniques (mentioning)
confidence: 99%
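A minimal sketch of the backward-elimination loop behind RFE, under stated assumptions: a toy scoring model (rank modules by the sum of their selected metrics) evaluated with a rank-based AUC helper. caret's rfe() additionally cross-validates each candidate subset, which is omitted here; all names and data are invented for illustration.

```python
def auc(scores, labels):
    """Rank-based AUC helper (ties counted as 1/2)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def sum_score(rows, labels, subset):
    # Toy "model": rank each module by the sum of its selected metrics.
    return auc([sum(row[f] for f in subset) for row in rows], labels)

def rfe(rows, labels, evaluate):
    """Backward elimination: repeatedly drop the single feature whose
    removal gives the best score; return the best-scoring subset seen."""
    selected = list(range(len(rows[0])))
    best_subset, best_score = selected[:], evaluate(rows, labels, selected)
    while len(selected) > 1:
        candidates = [[f for f in selected if f != drop] for drop in selected]
        score, selected = max((evaluate(rows, labels, c), c) for c in candidates)
        if score >= best_score:
            best_score, best_subset = score, selected[:]
    return best_subset, best_score
```

With one predictive metric and one noise metric, the loop discards the noise column and keeps the subset that maximises AUC, mirroring what the quoted study obtains from caret's rfe.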