2009 13th European Conference on Software Maintenance and Reengineering
DOI: 10.1109/csmr.2009.55

Evaluating Defect Prediction Models for a Large Evolving Software System

Abstract: A plethora of defect prediction models has been proposed and empirically evaluated, often using standard classification performance measures. In this paper, we explore defect prediction models for a large, multi-release software system from the telecommunications domain. A history of roughly 3 years is analyzed to extract process and static code metrics that are used to build several defect prediction models with Random Forests. The performance of the resulting models is comparable to previously published work. …
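
The abstract credits Random Forests trained on process and static code metrics. A minimal sketch of that kind of setup, assuming scikit-learn and entirely hypothetical metrics (the paper's actual feature set is not reproduced here):

```python
# Minimal sketch, not the paper's pipeline: a Random Forest defect
# predictor over hypothetical per-file metrics (LoC, complexity, churn).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 200
X = rng.integers(1, 1000, size=(n, 3)).astype(float)  # synthetic metrics
y = (X[:, 0] * X[:, 2] > 100_000).astype(int)         # synthetic labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_tr, y_tr)
print(f"holdout accuracy: {clf.score(X_te, y_te):.2f}")
```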

Cited by 18 publications (10 citation statements)
References 7 publications
“…It seems that the trivial model performs better for data sets where the 90th percentile of LoC for fault-free files is lower. This can explain an observation we made in a recent study: we found that different prediction models consistently performed better when header files were included [20]. Since header files are usually much shorter than their corresponding implementation files, and since, at least in our case, header files contained fewer defects, the 90th percentile for fault-free files was lower when we included header files.…”
Section: Module-based Evaluation of a Loc-mom (supporting)
confidence: 53%
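
The excerpt hinges on one data-set characteristic: the 90th percentile of LoC over fault-free files. A minimal sketch of computing it, assuming pandas and hypothetical column names ("loc", "defects") not taken from the study:

```python
# Minimal sketch under assumed column names: the 90th percentile of LoC
# among fault-free files, the characteristic the excerpt discusses.
import pandas as pd

files = pd.DataFrame({
    "loc":     [12, 45, 300, 800, 1500, 60, 25],  # hypothetical sizes
    "defects": [0,  0,  2,   0,   5,    0,  0],   # hypothetical counts
})

fault_free = files[files["defects"] == 0]
p90 = fault_free["loc"].quantile(0.90)
print(f"90th percentile of LoC for fault-free files: {p90:.0f}")
```

Adding short, fault-free header files to such a data set pulls this percentile down, which is the mechanism the excerpt proposes for the header-file observation.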
“…They propose to use a variation of lift charts where the x-axis contains the ratio of lines of code instead of modules, and evaluate various data mining algorithms by measuring the performance improvement provided by a classification technique over a random selection of modules. In a recent paper, we have extended this approach to take the characteristics of the data set into account by measuring the deviation from an optimal model [20]. The resulting performance measure is described in Section 3.…”
Section: Related Work (mentioning)
confidence: 99%
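
The excerpt describes a lift chart whose x-axis is the cumulative ratio of lines of code, with the model's ordering compared against a random selection of modules (the diagonal, area 0.5) and, in the extension [20], against an optimal ordering. A minimal sketch of both curves; module sizes, defect counts, and predicted risks below are all hypothetical toy data:

```python
# Minimal sketch, not the authors' implementation: LoC-based lift curves
# for a predicted ordering and the optimal (defect-density) ordering.
import numpy as np

loc     = np.array([100, 400, 50, 900, 200])   # module sizes (hypothetical)
defects = np.array([2,   0,   1,  3,   0])     # true defect counts
risk    = np.array([0.7, 0.2, 0.9, 0.5, 0.1])  # predicted probabilities

def curve(order):
    """Cumulative (LoC ratio, defect ratio) along an inspection order."""
    x = np.concatenate(([0.0], np.cumsum(loc[order]) / loc.sum()))
    y = np.concatenate(([0.0], np.cumsum(defects[order]) / defects.sum()))
    return x, y

pred_x, pred_y = curve(np.argsort(-risk))           # model's ordering
opt_x,  opt_y  = curve(np.argsort(-defects / loc))  # optimal ordering

# Areas under the curves; a random selection of code has area 0.5, and
# normalizing by the optimal area gives a deviation-from-optimal measure.
auc_pred = np.trapz(pred_y, pred_x)
auc_opt  = np.trapz(opt_y,  opt_x)
print(f"model: {auc_pred:.3f}, optimal: {auc_opt:.3f}, random: 0.500")
```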
“…The evaluation of defect prediction models is a controversial topic, as recent studies show [7], [8], [20]-[22]. One problem is the choice of adequate performance measures, where 'adequate' depends on usage scenarios, criticality, and budget.…”
Section: Related Work (mentioning)
confidence: 99%