BugsInPy: a database of existing bugs in Python programs to enable controlled testing and debugging studies

Widyasari, Ratnadira; Sim, Sheng Qin; Lok, Camellia; Qi, Haodi; Phan, Jack; Tay, Qijin; Tan, Constance; Wee, Fiona; Tan, Jodie Ethelda; Yieh, Yuheng; Goh, Brian K. P.; Thung, Ferdian; Kang, Hong Jin; Hoang, Thong; Lo, David; Ouh, Eng Lieh

doi:10.1145/3368089.3417943

Cited by 60 publications

(17 citation statements)

References 20 publications

“…Seeded faults are often used to replicate real fault behavior, especially when the real faults cannot be reproduced due to many reasons including technical ones or because they are not available for programs written in certain programming languages. Also, they can be used to solve the issue of unbalanced test suites in real fault datasets such as Defects4J [92] for Java programs, BugsJS [93] for JavaScript programs, and BugsInPy [94] for Python programs, where the passed test cases are much more common than the failed test cases. It is worth mentioning that they are widely used in multiple fault localization studies with about 70.91% of the selected studies utilizing them.…”

Section: Seeded and Real Bugs (mentioning)

confidence: 99%

A Survey of Challenges in Spectrum-Based Software Fault Localization

Sarhan, Beszedes. 2022. IEEE Access

In software debugging, fault localization is the most difficult, expensive, tedious, and time-consuming task, particularly for large-scale software systems. This is due to the fact that it requires significant human participation and it is difficult to automate its sub-tasks. Therefore, there is a high demand for automatic fault localization techniques that can help software engineers effectively find the locations of faults with minimal human intervention. This has led to the proposal of implementing different types of such techniques. However, Spectrum Based Fault Localization (SBFL) is considered amongst the most prominent techniques in this respect due to its efficiency and effectiveness. In SBFL, the probability of each program element (e.g., statement, block, or function) being faulty is calculated based on the results of executing test cases and their corresponding code coverage information. However, SBFL techniques are not yet widely adopted in the industry. The rationale behind this is that they pose a number of issues and their performance is affected by several influential factors. For example, the characteristics of bugs, target programs, test suites, and supporting tools make their effectiveness differ dramatically from one case to another. There are numerous studies on SBFL that cover its usage, formulas, performance, etc. So far, no dedicated survey points out comprehensively the issues of SBFL. In this paper, various SBFL challenges and issues have been identified, categorized, and discussed alongside many directions. Also, the paper raises awareness of the works being achieved to address the identified issues and suggests some potential solutions too.

Index terms: program spectra, spectrum based fault localization, software testing, challenges and issues, survey.

“…Due to the increasing use of Python, datasets of this language have emerged for research. One of the most recent and relevant is BugsInPy (Widyasari et al., 2020). This dataset is inspired by Defects4J and according to the authors follows a similar structure, including 493 bugs in 17 Python projects.…”

Section: Related Work (mentioning)

confidence: 99%

A dataset of regressions in web applications detected by end-to-end tests

et al. 2021

End-to-end tests present many challenges in the industry. Their long running times make it impractical to apply research on test case prioritization or test case selection to them, as most work on these two problems is based on datasets of unit tests. Unit tests are fast to run, so time is not usually a criterion. This is because there is no dataset of end-to-end tests, owing to the infrastructure needed to run this kind of test, the complexity of the setup, and the lack of proper characterization of the faults and their fixes. Therefore, running end-to-end tests for any research work is hard and time-consuming, and the availability of a dataset containing regression bugs, documentation and logs for these tests might foster the usage of end-to-end tests in research works. This paper presents a) a dataset for this kind of tests, including six well-documented manually injected regression bugs and their corresponding fixes in three web applications built using Java and the Spring framework; b) tools for easing the execution of these tests no matter the infrastructure; and c) a comparative study with two well-known datasets of unit tests. The comparative study shows that there are important differences between end-to-end and unit tests, such as their execution time and the amount of resources they consume, which are much higher in the end-to-end tests. End-to-end testing deserves some attention from researchers. Our dataset is a first effort toward easing the usage of end-to-end tests in research works.

“…Test code changes are also an important factor in constructing real-world bug benchmarks. Since the availability of benchmarks facilitates software testing, debugging, and automated repairing techniques, changed test cases are identified to guarantee the reproducibility of bugs [66,25,23,35,52].…”

Section: Related Work (mentioning)

confidence: 99%

Demystifying Regular Expression Bugs: A comprehensive study on regular expression bug causes, fixes, and testing

Wang, Brown, Jennings, et al. 2021. Preprint

Regular expressions cause string-related bugs and open security vulnerabilities for DOS attacks. However, beyond ReDoS (Regular expression Denial of Service), little is known about the extent to which regular expression issues affect software development and how these issues are addressed in practice. We conduct an empirical study of 356 merged regex-related pull request bugs from Apache, Mozilla, Facebook, and Google GitHub repositories. We identify and classify the nature of the regular expression problems, the fixes, and the related changes in the test code. The most important findings in this paper are as follows: 1) incorrect regular expression behavior is the dominant root cause of regular expression bugs (165/356, 46.3%). The remaining root causes are incorrect API usage (9.3%) and other code issues that require regular expression changes in the fix (29.5%), 2) fixing regular expression bugs is nontrivial as it takes more time and more lines of code to fix them compared to the general pull requests, 3) most (51%) of the regex-related pull requests do not contain test code changes. Certain regex bug types (e.g., compile error, performance issues, regex representation) are less likely to include test code changes than others, and 4) the dominant type of test code changes in regex-related pull requests is test case addition (75%). The results of this study contribute to a broader understanding of the practical problems faced by developers when using, fixing, and testing regular expressions.
