2017
DOI: 10.1587/transinf.2016edp7204

The Performance Stability of Defect Prediction Models with Class Imbalance: An Empirical Study

Abstract: Class imbalance has drawn much attention from researchers in software defect prediction. In practice, the performance of defect prediction models may be affected by the class imbalance problem. In this paper, we present an approach to evaluating the performance stability of defect prediction models on imbalanced datasets. First, random sampling is applied to convert the original imbalanced dataset into a set of new datasets with different levels of imbalance ratio. Second, typical prediction models are selected …
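The abstract's first step, deriving multiple datasets with different imbalance ratios from one original dataset via random sampling, can be sketched as follows. This is a minimal illustration assuming a binary-labeled NumPy dataset; the helper make_imbalanced and the ratio definition (defective share of the total) are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np
from sklearn.datasets import make_classification

def make_imbalanced(X, y, defect_ratio, seed=0):
    """Randomly subsample a binary dataset so defective modules (y == 1)
    make up `defect_ratio` of the result. Illustrative helper only."""
    rng = np.random.default_rng(seed)
    pos, neg = np.flatnonzero(y == 1), np.flatnonzero(y == 0)
    # Keep all non-defective modules; sample defective ones to hit the ratio.
    n_pos = min(int(len(neg) * defect_ratio / (1 - defect_ratio)), len(pos))
    keep = np.concatenate([rng.choice(pos, n_pos, replace=False), neg])
    rng.shuffle(keep)
    return X[keep], y[keep]

# Derive several datasets with different imbalance levels from one original.
X, y = make_classification(n_samples=1000, weights=[0.7], random_state=0)
variants = {r: make_imbalanced(X, y, r) for r in (0.05, 0.10, 0.20)}
```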

Cited by 34 publications (32 citation statements)
References 37 publications
“…To determine the performance stability of the prediction models, the coefficient of variation (C.V) was applied to the results of the prediction models. C.V, the percentage ratio of the standard deviation (SD) to the average (AVE), is used to remove the effect of differences in averages on the stability comparison [15,40]. The formula for C.V is given as follows:…”
Section: Accuracy (mentioning)
confidence: 99%
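The excerpt truncates the C.V formula itself. From the definition it gives, the percentage ratio of the standard deviation to the average, the standard coefficient of variation reads:

\[
\mathrm{C.V} = \frac{SD}{AVE} \times 100\%
\]

A lower C.V across repeated runs indicates a more stable prediction model.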
“…SDP can be regarded as a classification task that categorizes software modules as either defective or non-defective, based on historical data and software metrics or features [14][15][16]. Software metrics, or features, reflect the characteristics of software modules.…”
Section: Introduction (mentioning)
confidence: 99%
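The classification task this excerpt describes can be illustrated with a short, hedged sketch; the toy dataset, metric columns, and classifier choice below (scikit-learn's RandomForestClassifier) are placeholders, not the cited paper's setup.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Toy stand-in for a defect dataset: rows = modules, columns = software
# metrics (e.g., size, complexity); y: 1 = defective, 0 = non-defective.
X, y = make_classification(n_samples=500, n_features=10, weights=[0.85],
                           random_state=0)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
print(clf.predict(X_te)[:10])  # predicted defect labels for unseen modules
```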
“…In addition to comparing the class correlation values and the classification performance values, we conduct the Wilcoxon matched-pairs signed-rank test [50] at the 95% confidence level, a nonparametric test for related samples, to determine whether the differences in class correlation values among the five process metrics are significant and whether the differences in classification performance among the five process metrics are significant. This statistical method has been widely used in SDP [51], [52]. The null hypothesis is that there is no significant difference among the five process metrics at the 95% confidence level.…”
Section: GR(A) = IG(S|A) / SplitE(A) (mentioning)
confidence: 99%
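As a concrete illustration of the test this excerpt applies, the sketch below uses scipy.stats.wilcoxon on paired scores; the score vectors are made-up placeholders, not the cited study's measurements.

```python
from scipy.stats import wilcoxon

# Paired performance scores of two process metrics over the same datasets
# (illustrative numbers only).
metric_a = [0.71, 0.68, 0.74, 0.69, 0.73, 0.70, 0.72, 0.66]
metric_b = [0.69, 0.67, 0.70, 0.70, 0.69, 0.68, 0.71, 0.65]

stat, p = wilcoxon(metric_a, metric_b)
# Reject the null hypothesis of "no difference" when p < 0.05 (95% level).
print(f"W = {stat:.1f}, p = {p:.3f}, significant: {p < 0.05}")
```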
“…Data pre-processing, which includes feature reduction and resampling techniques, is valuable for enhancing classification performance and decreasing time cost [44]-[46]. Feature reduction is used to increase the generalization performance of classification [15], [47]-[53] by removing irrelevant features from balanced and imbalanced datasets. However, all these methods focus on the binary imbalance problem.…”
Section: Related Work (mentioning)
confidence: 99%
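A minimal sketch of the two pre-processing steps this excerpt names, feature reduction and resampling, assuming scikit-learn and imbalanced-learn; the toy data and parameter choices are illustrative only.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from imblearn.over_sampling import RandomOverSampler

# Imbalanced toy data standing in for a defect dataset (illustrative only).
X, y = make_classification(n_samples=400, n_features=20, n_informative=5,
                           weights=[0.9], random_state=0)

# Feature reduction: keep the 5 features most correlated with the label.
X_reduced = SelectKBest(f_classif, k=5).fit_transform(X, y)

# Resampling: duplicate minority (defective) samples until classes balance.
X_bal, y_bal = RandomOverSampler(random_state=0).fit_resample(X_reduced, y)
```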