Proceedings of the 2015 International Symposium on Software Testing and Analysis (ISSTA 2015)
DOI: 10.1145/2771783.2771791
An analysis of patch plausibility and correctness for generate-and-validate patch generation systems

Abstract: We analyze reported patches for three prior generate-and-validate patch generation systems (GenProg, RSRepair, and AE). Because of experimental error, the majority of the reported patches violate the basic principle behind the design of these systems: they do not produce correct outputs even for the inputs in the test suite used to validate the patches. We also show that the overwhelming majority of the accepted patches are not correct and are equivalent to a single modification that simply deletes functionality…

Cited by 330 publications (393 citation statements). References 47 publications (91 reference statements).
“…This gives a nuanced picture of the results, which must however be taken, as usual, with a grain of salt: different tools may focus on achieving a better ranking vs. correctly fixing more bugs, and we do not imply that there is one universal measure of effectiveness. In any case, our evaluation is widely applicable, including to papers that may not detail this aspect, and is in line with what is done in other evaluations [14], [17], [25], [28], [29].…”
Section: E. Threats to Validity (supporting)
confidence: 73%
“…We quantitatively compare JAID to all other available tools for APR of Java programs that have also used DEFECTS4J in their evaluations: 1) jGenProg is the implementation of GenProg [14], [33], which works on C, for Java programs; we refer to jGenProg's evaluation in [19]; 2) jKali is the implementation of Kali [28], which works on C, for Java programs; we refer to jKali's evaluation in [19]; 3) Nopol focuses on fixing Java conditional expressions; we refer to Nopol's evaluation in [19]; 4) xPAR is a reimplementation of PAR [12], which is not publicly available, discussed in [13] and [35]; 5) HDA implements the "history-driven" technique of [13]; 6) ACS implements the "precise condition synthesis" of [35].…”
Section: Setup (mentioning)
confidence: 99%
“…The second thing is that we strongly advise researchers to evaluate the correctness rate of their automatic repair as well as the fixing rate, whether by human inspection or by ground-truth comparison against a benchmark. Existing repair systems may fail to generate a true patch due to test-suite overfitting [61], [62], a concept borrowed from statistics and machine learning. Here, overfitting means that the repair makes the program perform well on the test suite while failing in real usage.…”
Section: Impact of Test Suite Quality (mentioning)
confidence: 99%
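
To make the overfitting failure mode concrete, the following is a minimal sketch in Python; it is not taken from the paper or the citing work, and all function names, tests, and the seeded defect are hypothetical. It shows a patch that is plausible with respect to a weak test suite only because it deletes functionality, which is the pattern the study above reports for most accepted GenProg, RSRepair, and AE patches.

```python
# Minimal sketch (hypothetical): a generate-and-validate "patch" can be
# plausible, i.e. pass every test in a weak suite, while being incorrect
# because it simply deletes functionality.

def apply_discount_buggy(total, items):
    """Intended behavior: 10% discount for 3 or more items.
    Defect: crashes with ZeroDivisionError when items == 0."""
    rate = 0.10 if items >= 3 else 0.0
    per_item = total / items            # the seeded defect
    return total - total * rate if per_item > 0 else total

def apply_discount_patched(total, items):
    """A functionality-deleting 'repair': the discount logic is removed,
    so the crash disappears and all (weak) tests still pass."""
    return total

# Weak validation suite: it never checks that a discount is actually applied.
suite = [((100.0, 0), 100.0), ((50.0, 1), 50.0)]

def plausible(candidate):
    """A patch is plausible if it produces the expected output for every test."""
    try:
        return all(candidate(*args) == expected for args, expected in suite)
    except Exception:
        return False

if __name__ == "__main__":
    print(plausible(apply_discount_buggy))    # False: crashes on (100.0, 0)
    print(plausible(apply_discount_patched))  # True: yet no discount is ever given
```

A stronger suite containing an input with three or more items would reject the deletion patch, which is why the cited statement recommends assessing correctness, not just test-suite plausibility.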