A Comprehensive Empirical Study of Count Models for Software Fault Prediction

Gao, Kehan; Khoshgoftaar, Taghi M.

doi:10.1109/tr.2007.896761

Cited by 63 publications

(25 citation statements)

References 14 publications


“…With the training set in place, we choose regressions for model selection. Our three candidate regressions are all generalized linear regressions previously used [12] for predicting failure count data: negative binomial regression, Poisson regression, and the logistic regression. The negative binomial and Poisson regressions estimate the number of failures for a given file, by which we rank for our prioritization.…”

Section: Model Selection and Validation (mentioning)

confidence: 99%

Predicting failures with developer networks and social network analysis

Meneely

Williams

Snipes

et al. 2008

Proceedings of the 16th ACM SIGSOFT International Symposium on Foundations of Software Engineering


Software fails and fixing it is expensive. Research in failure prediction has been highly successful at modeling software failures. Few models, however, consider the key cause of failures in software: people. Understanding the structure of developer collaboration could explain a lot about the reliability of the final product. We examine this collaboration structure with the developer network derived from code churn information that can predict failures at the file level. We conducted a case study involving a mature Nortel networking product of over three million lines of code. Failure prediction models were developed using test and post-release failure data from two releases, then validated against a subsequent release. One model's prioritization revealed 58% of the failures in 20% of the files compared with the optimal prioritization that would have found 61% in 20% of the files, indicating that a significant correlation exists between file-based developer network metrics and failures.
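The prioritization figures in this abstract (failures found in the top 20% of files, for a model versus the optimal ordering) can be illustrated with a minimal sketch. The function name `failures_found_in_top` and the ten-file data set are hypothetical and chosen only for illustration; they are not taken from the Nortel case study.

```python
def failures_found_in_top(predicted, actual, fraction=0.2):
    """Share of all actual failures that fall in the top `fraction`
    of files when files are ranked by predicted failure count."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    n_top = max(1, round(len(predicted) * fraction))
    # Rank file indices from highest to lowest predicted count.
    order = sorted(range(len(predicted)), key=lambda i: predicted[i], reverse=True)
    found = sum(actual[i] for i in order[:n_top])
    total = sum(actual)
    return found / total if total else 0.0


# Hypothetical data: observed failures and model estimates for ten files.
actual = [9, 0, 4, 0, 1, 0, 0, 5, 0, 1]
predicted = [7.5, 0.2, 1.0, 0.4, 6.5, 0.1, 0.3, 6.0, 0.2, 0.9]

model_share = failures_found_in_top(predicted, actual)
# Ranking by the true counts gives the best achievable share.
optimal_share = failures_found_in_top(actual, actual)
print(f"model: {model_share:.0%}, optimal: {optimal_share:.0%}")
```

Comparing the model's share against the optimal share, rather than against 100%, shows how close a ranking comes to the best ordering that the data allow.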


“…Moreover, the effectiveness of this method for predicting the number of bugs has been validated by prior studies [22,23,27,28].…”

Section: Lr (mentioning)

confidence: 94%

“…Because multiple linear regression (MLR) models have proven useful in software defect prediction [9,27,28], in this paper we define a simple MLR model predicting the scores of a given set of software entities, as described below.…”

Section: B Description Of Our Approach (mentioning)

confidence: 99%

A Ranking-Oriented Approach to Cross-Project Software Defect Prediction: An Empirical Study

You

2016

International Conferences on Software Engineering and Knowledge Engineering


In recent years, cross-project defect prediction (CPDP) has become very popular in the field of software defect prediction. It was treated as a binary classification or regression problem in most previous studies. However, the existing methods to solve this problem may not be suitable for projects with limited manpower and time. In this paper, we revisit the issue and treat it as a ranking problem. Inspired by the idea of the point-wise approach to Learning to Rank, we propose a ranking-oriented CPDP approach called ROCPDP. The empirical results obtained based on AEEEM show that the defect predictor built with our method under a specific CPDP context, in general, outperforms those predictors trained by using the benchmark methods in both CPDP and WPDP (within-project defect prediction) scenarios in terms of two common evaluation metrics for rank correlation. So, our work could be an initial attempt to construct new ranking-oriented CPDP models for newly created or inactive projects.


“…Gao and Khoshgoftaar [18] empirically evaluated eight statistical count models for software quality prediction. They showed that with a very large number of zero response variables, the zero inflated and hurdle-count models are more appropriate.…”

Section: Related Work (mentioning)

confidence: 99%

Using Faults-Slip-Through Metric as a Predictor of Fault-Proneness

Afzal

2010

2010 Asia Pacific Software Engineering Conference
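The citation statement above names zero-inflated and hurdle count models as better suited to data with many zero responses. A minimal sketch of the textbook Poisson-based forms of both models follows. The function names and parameter values are assumptions for illustration, and the parameterization used by Gao and Khoshgoftaar may differ.

```python
import math


def poisson_pmf(k, lam):
    """Standard Poisson probability of observing k faults."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson: with probability pi the module is a
    structural zero; otherwise the count follows Poisson(lam)."""
    base = (1 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


def hurdle_pmf(k, lam, p_zero):
    """Hurdle Poisson: zero occurs with probability p_zero; positive
    counts follow a zero-truncated Poisson(lam)."""
    if k == 0:
        return p_zero
    return (1 - p_zero) * poisson_pmf(k, lam) / (1 - math.exp(-lam))


# Hypothetical parameters: mean 2 faults, 60% of modules fault-free.
lam, zero_share = 2.0, 0.6
print("P(0) Poisson:", round(poisson_pmf(0, lam), 3))
print("P(0) ZIP:    ", round(zip_pmf(0, lam, zero_share), 3))
print("P(0) hurdle: ", round(hurdle_pmf(0, lam, zero_share), 3))
```

Both variants place extra probability mass at zero. The zero-inflated form mixes structural zeros with ordinary Poisson zeros, while the hurdle form models zero and positive counts with two separate processes.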
