2017 IEEE/ACM 39th International Conference on Software Engineering Companion (ICSE-C)
DOI: 10.1109/icse-c.2017.76
Codeflaws: a programming competition benchmark for evaluating automated program repair tools

Cited by 52 publications (36 citation statements). References: 19 publications.
“…To account for these requirements, we used the Codeflaws benchmark (Tan et al 2017). This benchmark consists of 7,436 programs (among which 3,902 are faulty) selected from the Codeforces online database of programming contests.…”
Section: Benchmarks: Programs and Fault(s) (mentioning, confidence: 99%)
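As a rough illustration of how a benchmark with this composition might be consumed, the sketch below tallies total and faulty program versions from a hypothetical metadata file; the file name, column names, and layout are assumptions made for illustration, not the actual Codeflaws distribution format.

```python
import csv
from collections import Counter

def summarize_benchmark(metadata_path="codeflaws_metadata.csv"):
    """Count total vs. faulty program versions listed in a (hypothetical)
    metadata file with columns: program_id, is_faulty, defect_class."""
    totals = Counter()
    per_class = Counter()
    with open(metadata_path, newline="") as fh:
        for row in csv.DictReader(fh):
            totals["programs"] += 1
            if row["is_faulty"] == "1":
                totals["faulty"] += 1
                per_class[row["defect_class"]] += 1
    return totals, per_class

if __name__ == "__main__":
    totals, per_class = summarize_benchmark()
    # For Codeflaws this should report roughly 7,436 programs,
    # 3,902 of them faulty, spread over the benchmark's defect classes.
    print(totals)
    print(per_class.most_common(5))
```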
“…To evaluate our approach we used CodeFlaws [5]. The benchmark has 3,902 faulty program versions of 40 defect classes.…”
Section: Results (mentioning, confidence: 99%)
“…This way testers can focus on the most promising mutants and apply mutation on a best-effort basis. Experimental results using 10-fold cross validation on 1,629 faults, from the CodeFlaws benchmark [5], show a high performance of our approach. In particular our mutant selection method achieves significantly better results than random mutant selection by revealing 12% to 20% more faults.…”
Section: Introduction (mentioning, confidence: 99%)
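To make the evaluation style described in that excerpt concrete, the sketch below compares a learned mutant ranking against random selection under 10-fold cross validation. The feature matrix, labels, model choice, and inspection budget are all placeholders, not the cited authors' actual setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)

# Placeholder data: one row per mutant, label = "mutant reveals a fault".
X = rng.normal(size=(1629, 20))
y = rng.integers(0, 2, size=1629)

def faults_revealed(selected_idx, labels):
    """Toy proxy: number of fault-revealing mutants among those selected."""
    return int(labels[selected_idx].sum())

kf = KFold(n_splits=10, shuffle=True, random_state=0)
budget = 50  # mutants a tester can afford to inspect per fold
learned, baseline = [], []

for train_idx, test_idx in kf.split(X):
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(X[train_idx], y[train_idx])
    scores = model.predict_proba(X[test_idx])[:, 1]
    top = np.argsort(scores)[::-1][:budget]              # model-ranked selection
    rand = rng.choice(len(test_idx), budget, replace=False)  # random baseline
    learned.append(faults_revealed(top, y[test_idx]))
    baseline.append(faults_revealed(rand, y[test_idx]))

print("learned ranking:", np.mean(learned), "random selection:", np.mean(baseline))
```

With real mutant features and labels, the gap between the two averages corresponds to the kind of improvement over random selection that the excerpt reports.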
“…Ensuring that bugs can be reliably reproduced allows datasets to be used for a rich diversity of studies, including testing, fault localisation, and automated program repair, as similar datasets for non-robotic systems (Le Goues et al, 2015;Just et al, 2014;Tan et al, 2017;Sahoo et al, 2010;Do et al, 2005;Henningsson and Wohlin, 2004) have demonstrated in broader contexts. These studies inspire our work to recreate and detect robotics and autonomous systems bugs in simulation, with a view towards detecting new bugs, which is a direction the previous work does not take.…”
Section: Introduction (mentioning, confidence: 99%)
“…The DEFECTS4J (Just et al, 2014) and MANYBUGS (Le Goues et al, 2015) datasets consist of historical bugs in large-scale Java and C programs, respectively. At the opposite end of the scale, the CODEFLAWS (Tan et al, 2017) and INTROCLASS (Le Goues et al, 2015) datasets are composed of bugs in small, single-file programming assignments (or challenges) completed by novices, using C. The Software Infrastructure Repository (Do et al, 2005) represents the first concerted effort to provide a dataset of reproducible faults. Unlike the aforementioned datasets, the SIR is predominantly composed of artificial bugs, and covers programs written in a variety of different languages.…”
(mentioning, confidence: 99%)