2011
DOI: 10.1002/spe.1043

Choosing software metrics for defect prediction: an investigation on feature selection techniques

Abstract: The selection of software metrics for building software quality prediction models is a search-based software engineering problem. An exhaustive search for such metrics is usually not feasible due to limited project resources, especially if the number of available metrics is large. Defect prediction models are necessary in aiding project managers for better utilizing valuable project resources for software quality improvement. The efficacy and usefulness of a fault-proneness prediction model is only as good as …

Citations: cited by 237 publications (156 citation statements)
References: 33 publications
“…So far, they have been widely used to estimate the defect-proneness of software components, and more details on these approaches can be found in the recent surveys [3,4]. On the other hand, given the large number of software metrics, feature subset selection and dimensionality reduction techniques have also been applied to these new defect prediction methods [22,23], and many empirical studies have demonstrated that they achieve higher accuracy and computing efficiency by removing redundant and irrelevant software metrics [10].…”
Section: Related Work (mentioning)
confidence: 99%
“…The eight methods used in this work are Chi-Square (CS), Correlation (Cor), Information Gain (IG), Symmetrical Uncertainty (SU), Fisher Score (FS), Welch T-Statistic (WTS), ReliefF (RF), and One Rule (OneR). We choose these methods because they are widely used in defect prediction and belong to different feature selection families [28], [30]. CS is a statistic-based method, Cor is a correlation-based method, IG and SU are entropy-based methods, FS and WTS are first-order statistics-based methods, RF is an instance-based method, and OneR is a classifier-based method.…”
Section: A. Feature Ranking Methods (mentioning)
confidence: 99%
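As a rough illustration of the two entropy-based rankers named in the statement above (IG and SU), the following is a minimal sketch using the textbook definitions for discrete features. It is not the cited authors' implementation; the metric names and toy data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def information_gain(feature, labels):
    """IG = H(labels) - H(labels | feature) for a discrete feature."""
    n = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def symmetrical_uncertainty(feature, labels):
    """SU = 2 * IG / (H(feature) + H(labels)), which lies in [0, 1]."""
    denom = entropy(feature) + entropy(labels)
    return 0.0 if denom == 0 else 2 * information_gain(feature, labels) / denom


def rank_features(table, labels, score):
    """Return feature names sorted from highest to lowest score."""
    return sorted(table, key=lambda name: score(table[name], labels), reverse=True)


# Hypothetical toy data: 1 = defective module, 0 = clean module.
labels = [1, 1, 0, 0, 1, 0]
metrics = {
    "many_comments":   [1, 0, 1, 0, 1, 0],  # nearly unrelated to the label
    "large_file":      [1, 1, 1, 0, 1, 0],  # partially related
    "high_complexity": [1, 1, 0, 0, 1, 0],  # identical to the label
}

ig_ranking = rank_features(metrics, labels, information_gain)
su_ranking = rank_features(metrics, labels, symmetrical_uncertainty)
```

Continuous metrics would need to be discretized before either score applies; that step is omitted here.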
“…Various methods have been successfully introduced to assist the selection of a feature subset that could benefit the defect prediction process on SDD. Previous studies have shown that diverse feature selection methods yield quite different performance on prediction models for SDD [6], [7], which implies that different methods might not be equivalent, that is, different methods would identify different sets of features as relevant. However, to the best of our knowledge, no previous studies proposed a method to investigate the equivalence of different feature selection methods.…” [DOI reference number: 10.18293/SEKE2017-097]
Section: Introduction (mentioning)
confidence: 99%
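One simple way to see whether two feature selection methods "identify different sets of features as relevant" is to compare their top-k selections with the Jaccard index. The sketch below is an assumption for illustration only and is not the equivalence method proposed by the citing paper; the feature names are hypothetical.

```python
def jaccard(selected_a, selected_b):
    """Jaccard index of two selected-feature collections (1.0 = identical sets)."""
    a, b = set(selected_a), set(selected_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical top-3 selections produced by two different rankers.
top_by_method_1 = ["loc", "cyclomatic_complexity", "fan_out"]
top_by_method_2 = ["loc", "cyclomatic_complexity", "code_churn"]

overlap = jaccard(top_by_method_1, top_by_method_2)  # 2 shared out of 4 distinct
```

A value well below 1.0 indicates that the two methods disagree on which metrics matter, even if the resulting models perform similarly.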
“…Gao et al. [19] studied four different filter-based feature selection methods with five different classifiers on a large telecommunication system and found that the Kolmogorov-Smirnov method performed the best. Gao et al. [20] presented a comparative investigation to evaluate their proposed hybrid feature selection method, which first uses feature ranking to reduce the search space and then applies feature subset selection. To investigate different feature selection methods for classification-based bug prediction, Shivaji et al. [21] utilized six feature selection methods to iteratively remove irrelevant features until the best F-measure was achieved.…”
Section: B. Feature Selection in Defect Prediction (mentioning)
confidence: 99%
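The two ideas in the statement above (rank first to shrink the search space, then search subsets; remove features iteratively while the F-measure improves) can be combined in a small sketch. Everything here is an assumption made for illustration: the `mean_gap` ranking score, the leave-one-out 1-nearest-neighbour evaluator, and the toy data are stand-ins, not the procedures of the cited studies.

```python
def f_measure(y_true, y_pred):
    """F-measure (F1) for the positive class, 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return 0.0 if tp == 0 else 2 * tp / (2 * tp + fp + fn)


def loo_1nn_f_measure(rows, labels, subset):
    """Leave-one-out F-measure of a 1-nearest-neighbour classifier on `subset` columns."""
    preds = []
    for i, row in enumerate(rows):
        nearest = min(
            (j for j in range(len(rows)) if j != i),
            key=lambda j: sum((row[c] - rows[j][c]) ** 2 for c in subset),
        )
        preds.append(labels[nearest])
    return f_measure(labels, preds)


def mean_gap(rows, labels, col):
    """Stand-in ranking score: absolute difference between the class means of a column."""
    pos = [r[col] for r, y in zip(rows, labels) if y == 1]
    neg = [r[col] for r, y in zip(rows, labels) if y == 0]
    return abs(sum(pos) / len(pos) - sum(neg) / len(neg))


def hybrid_select(rows, labels, k):
    """Stage 1: keep the k top-ranked columns. Stage 2: greedy backward elimination."""
    n_cols = len(rows[0])
    ranked = sorted(range(n_cols), key=lambda c: mean_gap(rows, labels, c), reverse=True)
    subset = ranked[:k]
    best = loo_1nn_f_measure(rows, labels, subset)
    improved = True
    while improved and len(subset) > 1:
        improved = False
        for col in list(subset):
            trial = [c for c in subset if c != col]
            score = loo_1nn_f_measure(rows, labels, trial)
            if score > best:  # drop a column only if the F-measure strictly improves
                subset, best, improved = trial, score, True
                break
    return subset, best


# Hypothetical toy data: column 0 is informative, column 2 is large-scale noise,
# column 1 carries no class signal, column 3 is constant.
rows = [
    [9, 3, 50, 1], [8, 4, 10, 1], [9, 2, 30, 1],
    [2, 3, 45, 1], [1, 2, 15, 1], [2, 4, 35, 1],
]
labels = [1, 1, 1, 0, 0, 0]

subset, best = hybrid_select(rows, labels, k=3)
```

On this toy data the ranking stage discards the constant column, and the search stage then drops the noisy column 2, because its scale dominates the distance computation and hurts the F-measure.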