2018
DOI: 10.1007/s10664-018-9619-4

Alleviating patch overfitting with automatic test generation: a study of feasibility and effectiveness for the Nopol repair system

Abstract: Among the many different kinds of program repair techniques, one widely studied family of techniques is called test suite based repair. However, test suites are in essence input-output specifications and are thus typically inadequate for completely specifying the expected behavior of the program under repair. Consequently, the patches generated by test suite based repair techniques can just overfit to the used test suite, and fail to generalize to other tests. We deeply analyze the overfitting problem in progr…

Cited by 56 publications (50 citation statements)
References 64 publications
“…To sum up, our contributions are: • A new version of QuixBugs that is usable for automatic repair research on Java programs, together with extensive data about the characteristics of QuixBugs. • The confirmation of 2 empirical facts of program repair, improving their external validity: 1) the state-of-the-art program repair tools produce overfitting patches, this confirms the results of [35], [31], [15]; 2) the state-of-the-art program repair tools also produce correct patches [29], [20]; 3) automatically generated tests can help to assess the correctness of patches in scientific studies, this confirms the results of [40], [36], [41]. • Three new and important findings about program repair: 1) the state-of-the-art program repair tools are able to repair programs with only failing test cases and no passing tests at all; 2) it is useful to design program specific test generators to discard incorrect patches; and 3) a small number of automatically generated test cases is enough to identify incorrect patches in scientific studies.…”
Section: Introduction (supporting)
confidence: 56%
“…In our experiment, we consider three techniques for patch correctness assessment: a) using automatically generated tests by a search-based approach based on a reference version [41]; b) using automatically generated tests by a program specific generator based on a reference version [2]; and c) manual analysis of patch correctness [20]. a) Search-based Test Generation Technique: Using automated test generation is one way for assessing patch correctness [37], [36], [41], [40]. In our study, the search-based test generator technique takes as input a reference version of buggy program.…”
Section: B Methodology (mentioning)
confidence: 99%
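The idea in the statement above, checking a patch against tests generated from a reference version, can be illustrated with a small sketch. Everything here is hypothetical and chosen for illustration: the function names, the three-case test suite, and the input range are assumptions, and plain random input generation stands in for the search-based generator that the quoted study mentions.

```python
import random

def reference_mid(a, b, c):
    """Reference (assumed-correct) version: median of three integers."""
    return sorted([a, b, c])[1]

def overfitting_patch_mid(a, b, c):
    """Hypothetical patch: passes the original tests but is wrong in general."""
    if a <= b <= c:
        return b
    return a  # correct only when a happens to be the median

# Hypothetical original test suite: (inputs, expected output).
ORIGINAL_TESTS = [((1, 2, 3), 2), ((2, 1, 3), 2), ((2, 3, 1), 2)]

def passes_original_suite(candidate):
    """True if the candidate satisfies every test in the original suite."""
    return all(candidate(*args) == expected for args, expected in ORIGINAL_TESTS)

def find_counterexample(candidate, reference, trials=1000, seed=0):
    """Generate random inputs and return the first one on which the
    candidate and the reference disagree, or None if none is found."""
    rng = random.Random(seed)
    for _ in range(trials):
        args = tuple(rng.randint(-10, 10) for _ in range(3))
        if candidate(*args) != reference(*args):
            return args
    return None

if __name__ == "__main__":
    print("passes original suite:", passes_original_suite(overfitting_patch_mid))
    print("counterexample:", find_counterexample(overfitting_patch_mid, reference_mid))
```

A patch that passes the original suite yet yields a counterexample here is test-suite adequate but overfitting. Finding no counterexample does not prove the patch correct; it only means the generated inputs did not expose a difference.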
“…This study focuses on test-suite adequate patches, which means that the generated patches make the test suite pass; yet, there is no guarantee that they fix the bugs. Studying patch correctness [19,44,49] is out of the scope of this work. Our goal is to analyze the current state of the automatic program repair tools and identify potential flaws and improvements.…”
Section: Discussion (mentioning)
confidence: 99%