2005
DOI: 10.1109/tse.2005.58

Reliability and validity in comparative studies of software prediction models

Abstract: Empirical studies on software prediction models do not converge with respect to the question "which prediction model is best?" The reason for this lack of convergence is poorly understood. In this simulation study, we have examined a frequently used research procedure comprising three main ingredients: a single data sample, an accuracy indicator, and cross validation. Typically, these empirical studies compare a machine learning model with a regression model. In our study, we use simulation and compar…
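The research procedure the abstract describes (a single data sample, an accuracy indicator, and cross validation) can be made concrete with a short sketch. The code below is not the authors' code: the synthetic size/effort dataset, the choice of MMRE as the accuracy indicator, and the pairing of linear regression against a decision tree are all illustrative assumptions.

```python
# Minimal sketch (illustrative, not the paper's code) of the procedure the
# abstract describes: one data sample, one accuracy indicator (MMRE), and
# cross validation used to compare a regression model with an ML model.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Synthetic "single data sample": effort grows with size, plus noise.
n = 100
size = rng.uniform(10, 1000, n)                  # e.g. size in function points
effort = 2.5 * size * rng.lognormal(0, 0.3, n)   # e.g. effort in person-hours
X, y = size.reshape(-1, 1), effort

def mmre(actual, predicted):
    """Mean magnitude of relative error: mean(|y - yhat| / y)."""
    return float(np.mean(np.abs(actual - predicted) / actual))

models = {
    "regression": LinearRegression(),
    "machine learning": DecisionTreeRegressor(max_depth=3, random_state=0),
}

# 10-fold cross validation on the single sample.
kf = KFold(n_splits=10, shuffle=True, random_state=0)
for name, model in models.items():
    fold_scores = []
    for train_idx, test_idx in kf.split(X):
        model.fit(X[train_idx], y[train_idx])
        fold_scores.append(mmre(y[test_idx], model.predict(X[test_idx])))
    print(f"{name}: MMRE = {np.mean(fold_scores):.3f}")
```

Re-running the sketch with different random seeds illustrates the concern the paper raises: which model "wins" under a single sample and a single accuracy indicator can change from run to run.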

Cited by 207 publications (174 citation statements); references 37 publications. Citing publications range from 2006 to 2017.
“…First, no single prediction technique dominates [3] and, second, making sense of the many prediction results is hampered by the use of different data sets, data pre-processing, validation schemes and performance statistics [4], [3], [5], [6]. These differences are compounded by the lack of any agreed reporting protocols or even the need to share code and algorithms [7].…”
Section: Introduction (citation type: mentioning; confidence: 99%)
“…However, there are still large discrepancies regarding the assessment of the goodness of the different techniques and the reasons for such discrepancies [44,60,39]. For example, Lessmann et al. [33] compare 22 classifiers grouped into statistical methods, nearest neighbour methods, neural networks, support vector machines, decision trees and ensemble methods over ten datasets from the NASA repository.…”
Section: Related Work (citation type: mentioning; confidence: 99%)
“…The result can be multiplied by 100 to get the percentage of deviation from the actual value. The MMRE is the mean of the MRE; it is one of the most widely used criteria for assessing the performance of software prediction models [30,31]. Table 8 shows the MRE values in the data set.…”
Section: Model Validation (citation type: mentioning; confidence: 99%)
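For reference, the MRE and MMRE named in this excerpt are standard accuracy measures; using the usual definitions (notation mine, not taken from the cited paper), for an actual value $y_i$ and a predicted value $\hat{y}_i$ over $n$ observations:

$$\mathrm{MRE}_i = \frac{\lvert y_i - \hat{y}_i \rvert}{y_i}, \qquad \mathrm{MMRE} = \frac{1}{n} \sum_{i=1}^{n} \mathrm{MRE}_i$$

For example, an actual effort of 100 person-hours predicted as 80 gives MRE = |100 − 80| / 100 = 0.20, i.e. a 20% deviation once multiplied by 100, as the excerpt notes.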