2015 IEEE/ACM 37th IEEE International Conference on Software Engineering
DOI: 10.1109/icse.2015.93

The Impact of Mislabelling on the Performance and Interpretation of Defect Prediction Models

Abstract: The reliability of a prediction model depends on the quality of the data from which it was trained. Therefore, defect prediction models may be unreliable if they are trained using noisy data. Recent research suggests that randomly-injected noise that changes the classification (label) of software modules from defective to clean (and vice versa) can impact the performance of defect models. Yet, in reality, incorrectly labelled (i.e., mislabelled) issue reports are likely non-random. In this paper, we s…

Cited by 94 publications (64 citation statements)
References 43 publications
“…Kim et al [72] find that the randomly-generated noise has a large negative impact on the performance of defect models. On the other hand, our recent work [138] shows that realistic noise (i.e., noise generated by actually mislabelled issue reports [54]) does not typically impact the precision of defect prediction models.…”
Section: The Experimental Design of Defect Prediction Models
Mentioning confidence: 97%
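
The snippet above contrasts randomly-injected label noise with realistic noise from mislabelled issue reports. The sketch below illustrates the random-noise side of that comparison: flip a fraction of module labels at random, retrain, and compare precision. The dataset, column layout, and the 20% noise rate are illustrative assumptions, not the setup of the cited studies.

```python
# Minimal sketch of a random label-noise experiment (illustrative, not the
# cited studies' protocol): train on clean vs. randomly-flipped labels and
# compare precision on a held-out test set.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score
from sklearn.model_selection import train_test_split

def inject_random_label_noise(y, noise_rate, rng):
    """Flip `noise_rate` of the binary labels uniformly at random."""
    y_noisy = y.copy()
    flip = rng.random(len(y)) < noise_rate
    y_noisy[flip] = 1 - y_noisy[flip]
    return y_noisy

rng = np.random.default_rng(42)
X = rng.random((500, 10))                    # stand-in module metrics
y = (rng.random(500) < 0.2).astype(int)      # stand-in defect labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

clean_model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
noisy_model = RandomForestClassifier(random_state=0).fit(
    X_tr, inject_random_label_noise(y_tr, noise_rate=0.2, rng=rng))

print("precision (clean labels):", precision_score(y_te, clean_model.predict(X_te)))
print("precision (noisy labels):", precision_score(y_te, noisy_model.predict(X_te)))
```

A realistic-noise variant would replace the uniform flips with flips concentrated on the issue reports that are actually prone to mislabelling, which is the scenario the cited work [138] examines.
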
“…Jiang et al [62] and Bibi et al [13] use the default value of k for the k-nearest neighbours classification technique (k = 1). In our prior work (e.g., [39,138]), we ourselves have also used default classification settings.…”
Section: Related Work and Research Questions
Mentioning confidence: 99%
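
The snippet above concerns relying on a classifier's default settings (k = 1 for k-nearest neighbours) rather than tuning them. A small sketch of the difference follows; note that scikit-learn's default is n_neighbors=5, so the k = 1 default of the cited tools is passed explicitly here, and the data is synthetic and purely for illustration.

```python
# Default classifier settings vs. tuned settings (illustrative sketch).
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Mirrors the k = 1 default used in the cited studies.
default_knn = KNeighborsClassifier(n_neighbors=1).fit(X, y)

# Tuning k instead of accepting the default, via cross-validated grid search.
tuned_knn = GridSearchCV(KNeighborsClassifier(),
                         param_grid={"n_neighbors": [1, 3, 5, 7, 9, 11]},
                         cv=5).fit(X, y)
print("selected k:", tuned_knn.best_params_["n_neighbors"])
```
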
“…The algorithm was applied to each system of the dataset and for each set of predictors considered (i.e., based on structural metrics [30], entropy of changes [5], number of developers [56], scattering metrics [32], [33], and antipattern metrics [27]), and for this reason we had to analyze 34 ranks for each basic model. Therefore, as suggested by previous work [122], [123], [124], we again adopted the Scott-Knott ESD test [90], which in this case aimed to find the statistically significant features composing the models.…”
Section: RQ3 - Gain Provided by the Intensity Index
Mentioning confidence: 99%
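
The snippet above uses the Scott-Knott ESD test to group treatments (here, features) into statistically distinct ranks. Below is a deliberately simplified sketch of the idea behind that ranking: order treatments by their score distributions and place two treatments in the same rank when the effect size between them is negligible (Cohen's d < 0.2). This is an illustrative approximation with made-up data, not the ScottKnottESD implementation the cited work relies on.

```python
# Simplified, illustrative approximation of an effect-size-aware ranking.
import numpy as np

def cohens_d(a, b):
    """Absolute Cohen's d between two score distributions."""
    pooled = np.sqrt((np.var(a, ddof=1) + np.var(b, ddof=1)) / 2)
    return 0.0 if pooled == 0 else abs(np.mean(a) - np.mean(b)) / pooled

def esd_like_ranks(distributions, negligible=0.2):
    """distributions: dict mapping treatment name -> array of scores."""
    ordered = sorted(distributions, key=lambda k: np.mean(distributions[k]), reverse=True)
    ranks, current = [], [ordered[0]]
    for name in ordered[1:]:
        # Merge into the current rank when the difference from its best
        # member is a negligible effect size; otherwise start a new rank.
        if cohens_d(distributions[current[0]], distributions[name]) < negligible:
            current.append(name)
        else:
            ranks.append(current)
            current = [name]
    ranks.append(current)
    return ranks

rng = np.random.default_rng(1)
scores = {"featureA": rng.normal(0.80, 0.02, 30),
          "featureB": rng.normal(0.70, 0.02, 30),
          "featureC": rng.normal(0.55, 0.02, 30)}
print(esd_like_ranks(scores))  # treatments whose scores are effectively tied share a rank
```
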
“…Moreover, AUC and Brier score are robust to data where the distribution of the dependent variable is skewed (Fawcett, 2006). Nonetheless, we also measure precision, recall, and F-measure, which are commonly used in the software engineering literature (Elish and Elish, 2008; Foo et al, 2015; Tantithamthavorn et al, 2015; Zimmermann et al, 2005). Below, we describe each of the performance measures:…”
Section: Model Analysis (MA)
Mentioning confidence: 99%
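
The snippet above lists the performance measures used for model analysis. The short sketch below computes each of them with scikit-learn on made-up predictions: the threshold-independent measures (AUC, Brier score) take predicted probabilities, while precision, recall, and F-measure use a 0.5 probability cutoff, which is an assumption for illustration rather than the cited paper's setup.

```python
# Computing the listed performance measures on illustrative predictions.
import numpy as np
from sklearn.metrics import (roc_auc_score, brier_score_loss,
                             precision_score, recall_score, f1_score)

y_true = np.array([0, 0, 0, 0, 1, 1, 1, 0, 1, 0])           # made-up labels
y_prob = np.array([0.1, 0.3, 0.2, 0.6, 0.8, 0.4, 0.9, 0.2, 0.7, 0.1])
y_pred = (y_prob >= 0.5).astype(int)                         # 0.5 cutoff

print("AUC:        ", roc_auc_score(y_true, y_prob))
print("Brier score:", brier_score_loss(y_true, y_prob))
print("Precision:  ", precision_score(y_true, y_pred))
print("Recall:     ", recall_score(y_true, y_pred))
print("F-measure:  ", f1_score(y_true, y_pred))
```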