Published: 2021
DOI: 10.1007/s00500-021-06096-3

An empirical study toward dealing with noise and class imbalance issues in software defect prediction

Cited by 25 publications (6 citation statements)
References: 85 publications
“…Pandey and Tripathi [16] performed an empirical study focused on dealing with noise and class imbalance issues in software defect prediction. They show that if a dataset contains 10–40% incorrectly labeled instances, the true positive rate (TPR) and true negative rate (TNR) are reduced by 20–30% and receiver operating characteristic (ROC) values are reduced by 40–50%.…”
Section: Noise in SDP Datasets
confidence: 99%
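To make the scale of that degradation concrete, the following Python sketch injects label noise at increasing rates into a synthetic imbalanced dataset and reports TPR, TNR and AUC. The dataset, noise rates and logistic-regression classifier are illustrative assumptions, not the benchmark setup used by Pandey and Tripathi [16].

# Illustrative only: synthetic data and classifier choice are assumptions,
# not the cited study's experimental setup.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.8, 0.2], random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          stratify=y, random_state=42)

for noise_rate in (0.0, 0.1, 0.2, 0.3, 0.4):
    y_noisy = y_tr.copy()
    flip = rng.random(len(y_noisy)) < noise_rate      # flip labels at this rate
    y_noisy[flip] = 1 - y_noisy[flip]

    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_noisy)
    tn, fp, fn, tp = confusion_matrix(y_te, clf.predict(X_te)).ravel()
    auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
    print(f"noise={noise_rate:.1f}  TPR={tp / (tp + fn):.2f}  "
          f"TNR={tn / (tn + fp):.2f}  AUC={auc:.2f}")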
“…Kim et al. [10], Seiffert et al. [15], Pandey and Tripathi [16], and Tantithamthavorn et al. [17] have all pointed out that noise in the resulting dataset can lead to severe degradation of model performance, while Khan et al. [18] showed that noise filters struggle to mitigate the problem once noise is present in the dataset.…”
Section: Introduction
confidence: 99%
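The specific noise filters are not named in this excerpt; as a rough, hedged illustration of the general idea, the sketch below implements a simple consensus-style filter that flags an instance as potentially mislabeled when every cross-validated learner disagrees with its recorded label. The learner choices and the unanimous-vote rule are assumptions, not the filters evaluated by Khan et al. [18].

# Generic consensus-style noise filter (assumption-laden sketch, not Khan et
# al.'s method): flag an instance when all out-of-fold learners disagree with
# its recorded label.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

def consensus_noise_filter(X, y, random_state=0):
    learners = [
        LogisticRegression(max_iter=1000),
        RandomForestClassifier(n_estimators=100, random_state=random_state),
    ]
    # Out-of-fold predictions keep each model from "memorising" its own noise.
    oof_preds = [cross_val_predict(m, X, y, cv=5) for m in learners]
    disagreements = np.vstack([pred != y for pred in oof_preds])
    return disagreements.all(axis=0)   # True where the ensemble unanimously disagrees

# Usage (hypothetical arrays X, y):
#   noisy = consensus_noise_filter(X, y)
#   X_clean, y_clean = X[~noisy], y[~noisy]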
“…A machine learning prediction model was utilized on medical datasets for acute organ failure in critically ill patients (12). Traditional classification models assume that misclassification costs are equal, and an SDP model was presented for noise and class imbalance to remove noisy data, though it has the drawback of a high error rate when transforming the imbalanced dataset (13). A classifier ensemble method for high-dimensional data classification was proposed to overcome the baseline models, which have minor drawbacks in predicting imbalanced data in an optimized manner (14).…”
Section: Introduction
confidence: 99%
“…Extensive experiments on four widely used datasets indicate that the ISDA-based solution performs better than eight state-of-the-art methods, covering support vector machine (SVM), Random Forest, random oversampling (ROS), RUS, TSCS (Liu et al., 2014), CDDL (Jing et al., 2014), CEL (Sun et al., 2012), subclass discriminant analysis (SDA) and AdaBoost.NC (S. Wang & Yao, 2013). Pandey and Tripathi (2021) dealt with the impact of the noise and class imbalance problem on five defect models by adding various noise levels (0–80%), which provides guidelines for the possible range of tolerable noise for baseline models. The results of 864 experiments over three public datasets show that Random Forest outperforms the other state-of-the-art techniques under AUC and has a high noise tolerance rate (30–40%).…”
confidence: 99%
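For readers unfamiliar with the resampling baselines named above, the sketch below gives minimal NumPy implementations of random oversampling (ROS) and random undersampling (RUS). Balancing the classes to parity is an assumption made for illustration, not the configuration used in the cited experiments.

# Minimal ROS/RUS sketch: rebalance a binary defect dataset to class parity.
# The parity target and RNG handling are illustrative assumptions.
import numpy as np

def random_oversample(X, y, rng):
    """ROS: duplicate randomly chosen minority instances up to class parity."""
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[counts.argmin()]
    extra = rng.choice(np.flatnonzero(y == minority),
                       size=counts.max() - counts.min(), replace=True)
    keep = np.concatenate([np.arange(len(y)), extra])
    return X[keep], y[keep]

def random_undersample(X, y, rng):
    """RUS: randomly discard majority instances down to class parity."""
    classes, counts = np.unique(y, return_counts=True)
    minority, majority = classes[counts.argmin()], classes[counts.argmax()]
    keep_majority = rng.choice(np.flatnonzero(y == majority),
                               size=counts.min(), replace=False)
    keep = np.concatenate([np.flatnonzero(y == minority), keep_majority])
    return X[keep], y[keep]

# Usage (hypothetical arrays X, y):
#   X_ros, y_ros = random_oversample(X, y, np.random.default_rng(0))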
“…With regard to evaluation metrics, these studies employed discrepant metrics and even ignored false-positive indicators, although the studies are very meaningful to the software engineering community. Galar et al. (2012), Wang et al. (2016), and Pandey and Tripathi (2021) used only one measure, which does not provide a well-established point of reference for readers. Khoshgoftaar et al. (2014) and Diez-Pastor et al. (2015) employed comprehensive performance indicators, which can leave readers confused about the basic indicators (e.g., recall and FPR).…”
confidence: 99%
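As a small, hedged illustration of why a single measure can mislead, the toy example below scores a degenerate "predict everything as defective" classifier: recall alone looks perfect, while the false-positive rate (FPR) and AUC expose the failure. The labels and scores are made up purely for illustration.

# Toy example (made-up labels and scores, for illustration only): one metric
# in isolation can hide a useless classifier.
import numpy as np
from sklearn.metrics import confusion_matrix, recall_score, roc_auc_score

y_true   = np.array([1, 1, 0, 0, 0, 0, 0, 0, 0, 0])  # imbalanced ground truth
y_pred   = np.ones_like(y_true)                       # always predicts "defective"
y_scores = np.full(len(y_true), 0.5)                  # uninformative ranking scores

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("Recall:", recall_score(y_true, y_pred))        # 1.00 -> looks perfect
print("FPR   :", fp / (fp + tn))                      # 1.00 -> reveals the failure
print("AUC   :", roc_auc_score(y_true, y_scores))     # 0.50 -> no discriminative skill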