2009 13th European Conference on Software Maintenance and Reengineering
DOI: 10.1109/csmr.2009.55

Evaluating Defect Prediction Models for a Large Evolving Software System

Abstract: A plethora of defect prediction models has been proposed and empirically evaluated, often using standard classification performance measures. In this paper, we explore defect prediction models for a large, multi-release software system from the telecommunications domain. A history of roughly 3 years is analyzed to extract process and static code metrics that are used to build several defect prediction models with Random Forests. The performance of the resulting models is comparable to previously published work. …
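
The abstract credits Random Forests trained on process and static code metrics. A minimal sketch of that kind of setup, assuming scikit-learn and entirely hypothetical metrics (the paper's actual feature set is not reproduced here):

```python
# Minimal sketch, not the paper's pipeline: a Random Forest defect
# predictor over hypothetical per-file metrics (LoC, complexity, churn).
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 200
X = rng.integers(1, 1000, size=(n, 3)).astype(float)  # synthetic metrics
y = (X[:, 0] * X[:, 2] > 100_000).astype(int)         # synthetic labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_tr, y_tr)
print(f"holdout accuracy: {clf.score(X_te, y_te):.2f}")
```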

Cited by 18 publications (10 citation statements)
References 7 publications
“…It seems that the trivial model performs better for data sets where the 90th percentile of LoC for fault-free files is lower. This can explain an observation we made in a recent study: we found that different prediction models consistently performed better when header files were included [20]. Since header files are usually much shorter than their corresponding implementation files, and since, at least in our case, header files contained fewer defects, the 90th percentile for fault-free files was lower when we included header files.…”
Section: Module-based Evaluation of a Loc-mom (supporting)
confidence: 53%
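
The excerpt hinges on one data-set characteristic: the 90th percentile of LoC over fault-free files. A minimal sketch of computing it, assuming pandas and hypothetical column names ("loc", "defects") not taken from the study:

```python
# Minimal sketch under assumed column names: the 90th percentile of LoC
# among fault-free files, the characteristic the excerpt discusses.
import pandas as pd

files = pd.DataFrame({
    "loc":     [12, 45, 300, 800, 1500, 60, 25],  # hypothetical sizes
    "defects": [0,  0,  2,   0,   5,    0,  0],   # hypothetical counts
})

fault_free = files[files["defects"] == 0]
p90 = fault_free["loc"].quantile(0.90)
print(f"90th percentile of LoC for fault-free files: {p90:.0f}")
```

Adding short, fault-free header files to such a data set pulls this percentile down, which is the mechanism the excerpt proposes for the header-file observation.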
“…They propose to use a variation of lift charts where the x-axis contains the ratio of lines of code instead of modules, and evaluate various data mining algorithms by measuring the performance improvement provided by a classification technique over a random selection of modules. In a recent paper, we have extended this approach to take the characteristics of the data set into account by measuring the deviation from an optimal model [20]. The resulting performance measure is described in Section 3.…”
Section: Related Work (mentioning)
confidence: 99%
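
The excerpt describes a lift chart whose x-axis is the cumulative ratio of lines of code, with the model's ordering compared against a random selection of modules (the diagonal, area 0.5) and, in the extension [20], against an optimal ordering. A minimal sketch of both curves; module sizes, defect counts, and predicted risks below are all hypothetical toy data:

```python
# Minimal sketch, not the authors' implementation: LoC-based lift curves
# for a predicted ordering and the optimal (defect-density) ordering.
import numpy as np

loc     = np.array([100, 400, 50, 900, 200])   # module sizes (hypothetical)
defects = np.array([2,   0,   1,  3,   0])     # true defect counts
risk    = np.array([0.7, 0.2, 0.9, 0.5, 0.1])  # predicted probabilities

def curve(order):
    """Cumulative (LoC ratio, defect ratio) along an inspection order."""
    x = np.concatenate(([0.0], np.cumsum(loc[order]) / loc.sum()))
    y = np.concatenate(([0.0], np.cumsum(defects[order]) / defects.sum()))
    return x, y

pred_x, pred_y = curve(np.argsort(-risk))           # model's ordering
opt_x,  opt_y  = curve(np.argsort(-defects / loc))  # optimal ordering

# Areas under the curves; a random selection of code has area 0.5, and
# normalizing by the optimal area gives a deviation-from-optimal measure.
auc_pred = np.trapz(pred_y, pred_x)
auc_opt  = np.trapz(opt_y,  opt_x)
print(f"model: {auc_pred:.3f}, optimal: {auc_opt:.3f}, random: 0.500")
```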
“…The evaluation of defect prediction models is a controversial topic, as recent studies show [7], [8], [20]-[22]. One problem is the choice of adequate performance measures, where 'adequate' depends on usage scenarios, criticality, and budget.…”
Section: Related Work (mentioning)
confidence: 99%