Data Mining for Predictors of Software Quality

Khoshgoftaar, Taghi M.; Allen, Edward B.; Jones, Wendell; Hudepohl, J.P.

doi:10.1142/s0218194099000309

Cited by 39 publications

(10 citation statements)

References 28 publications


“…They also argued that change data is a better predictor of defects than code metrics in general. Studies by Arisholm and Briand [21] and Khoshgoftaar et al [22] also reported that prior changes are a good predictor of defects in a file. Hassan [3] used the complexity of a code change to predict defects.…”

Section: Using Process Metrics (mentioning)

confidence: 95%

Understanding the impact of code and process metrics on post-release defects

Shihab

Jiang

Ibrahim

et al. 2010

Proceedings of the 2010 ACM-IEEE International Symposium on Empirical Software Engineering and Measurement


Research studying the quality of software applications continues to grow rapidly, with researchers building regression models that combine a large number of metrics. However, these models are hard to deploy in practice due to the cost of collecting all the needed metrics, the complexity of the models, and their black-box nature. For example, techniques such as PCA merge a large number of metrics into composite metrics that are no longer easy to explain. In this paper, we use a statistical approach recently proposed by Cataldo et al. to create explainable regression models. A case study on the Eclipse open source project shows that only 4 out of the 34 code and process metrics impact the likelihood of finding a post-release defect. In addition, our approach is able to quantify the impact of these metrics on the likelihood of finding post-release defects. Finally, we demonstrate that our simple models achieve performance comparable to that of more complex PCA-based models while providing practitioners with intuitive explanations for their predictions.


“…Component dependencies can be used to compute relevant quality measures of software repositories, for instance to identify particularly fragile components [7,13,15]. It is well known that small-world networks are resilient to random failures but particularly weak in the presence of attacks, due to the existence of highly connected hub nodes [2].…”

Section: Strong Dependencies (mentioning)

confidence: 99%

Strong dependencies between software components

Abate

Cosmo

Boender

et al. 2009

2009 3rd International Symposium on Empirical Software Engineering and Measurement


“…And, this method is also useful in decreasing the number of bugs made by maintainers who are not aware of implicit coding rules. In addition, this method is very powerful in practical use because it can directly detect the line number of faulty code area with detailed instructions, while previous methods for predicting fault-prone modules using software metrics only say which modules are faulty [4,7].…”

Section: Introduction (mentioning)

confidence: 99%

A method for detecting faulty code violating implicit coding rules

Matsumura

Monden

Matsumoto

2002

Proceedings of the International Workshop on Principles of Software Evolution


In the field of legacy software maintenance, a large number of implicit coding rules, which are seldom written down in specification or design documents, arise unexpectedly as software becomes more complicated than it used to be. Since not all members of a maintenance team are aware of every implicit coding rule, a maintainer who is unaware of a rule often violates it during maintenance activities such as adding new functionality and repairing faults. The problem is not only that such a violation injects a new fault into the software, but also that the violation will be repeated again and again in the future by different maintainers. Indeed, we found that 32.7% of the faults of certain legacy software were due to such violations. This paper proposes a method for detecting code fragments that violate implicit coding rules. In the method, an expert maintainer first investigates the causes, situations, and code fragments of each fault described in bug reports, and identifies as many implicit coding rules as possible. Then, code patterns violating the rules (which we call bug code patterns) are described in a pattern description language. Finally, potentially faulty code fragments are extracted by a pattern matching technique.
