“…As mentioned in Section II, we found 8,829 flaky tests out of the 19,532 JUnit test methods analyzed: thus, 45% of the tests suffer from flakiness and, therefore, in the first place we can confirm previous findings on the relevance of the phenomenon [18], [12]. Figure 2 depicts a pie chart reporting the distribution of the flaky tests belonging to each of the categories defined by Luo et al. [18].…”
Section: RQ1: Causes Behind Test Code Flakiness (supporting)
confidence: 75%
“…Unfortunately, test suites are often affected by bugs that can preclude the correct testing of software systems [12], [13]. A typical bug affecting test suites is flakiness [6].…”
Abstract: Regression testing is a core activity that allows developers to ensure that source code changes do not introduce bugs. An important prerequisite, then, is that test cases are deterministic. However, this is not always the case, as some tests suffer from so-called flakiness. Flaky tests have serious consequences, as they can hide real bugs and increase software inspection costs. Existing research has focused on understanding the root causes of test flakiness and devising techniques to automatically fix flaky tests, with concurrency being a key area of investigation. In this paper, we investigate the relationship between flaky tests and three previously defined test smells, namely Resource Optimism, Indirect Testing, and Test Run War. We set up a study involving 19,532 JUnit test methods belonging to 18 software systems. A key result of our investigation is that 54% of the flaky tests contain a test code smell that can cause the flakiness. Moreover, we found that refactoring the test smells not only removed the design flaws, but also fixed all of the flaky tests causally co-occurring with test smells.
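To make the connection between test smells and flakiness concrete, the following is an illustrative sketch (not code from the paper) of the Resource Optimism smell in JUnit-style test code: a test optimistically assumes an external resource exists, and the refactoring makes the test set up its own fixture. All class, method, and file names here are hypothetical.

```java
// Hypothetical sketch of the Resource Optimism test smell and its fix.
// Not taken from the paper; names and paths are invented for illustration.
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ResourceOptimismSketch {

    // Smelly version: optimistically assumes the external file exists.
    // If another test or a CI cleanup job removes it, this fails
    // intermittently -- i.e., the test is flaky.
    static String readConfigOptimistically(Path config) throws IOException {
        return Files.readString(config);  // throws NoSuchFileException if absent
    }

    // Refactored version: the test creates the resource it depends on,
    // removing the hidden dependency on external state.
    static String readConfigSafely(Path config) throws IOException {
        if (Files.notExists(config)) {
            Files.writeString(config, "retries=3");  // deterministic fixture
        }
        return Files.readString(config);
    }

    public static void main(String[] args) throws IOException {
        Path config = Files.createTempDirectory("demo").resolve("app.properties");
        // The optimistic read would fail on a fresh environment;
        // the refactored read is deterministic.
        System.out.println(readConfigSafely(config).equals("retries=3") ? "PASS" : "FAIL");
    }
}
```

The refactoring mirrors the paper's finding that removing the smell (here, the unchecked assumption about external state) also removes the source of non-determinism.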
“…In our experiments, a similar behaviour was also simulated by means of the return and fail statements. Second, although the use of inheritance in test code is debatable [24] […] [42] or test smell [43] detection. One of the tasks performed during test refactoring is to reorganize test cases to remove the Eager Test and Lazy Test smells [43]; in this case, our model can help with the refactoring task, since it is not straightforward to manually reorganize test cases in a way that preserves the behaviour of the test suite.…”
As a software system evolves, its test suite can accumulate redundancies over time. Test minimization aims at removing redundant test cases. However, current techniques remove whole test cases from the test suite using test adequacy criteria, such as code coverage. This has two limitations, namely (1) by removing a whole test case the corresponding test assertions are also lost, which can inhibit test suite effectiveness, (2) …
CCS CONCEPTS• Software and its engineering → Software testing and debugging;
“…Like any other code, test code too contains faults [4]. Such faults negatively affect the effectiveness, reliability, and usefulness of the automated tests.…”
Automated unit tests are an essential software quality assurance measure that is widely used in practice. Thus, in many projects, large volumes of test code have co-evolved with the production code throughout development. Like any other code, test code too may contain faults, affecting the effectiveness, reliability, and usefulness of the tests. Furthermore, throughout the software system's ongoing development and maintenance phase, the test code also has to be constantly adapted and maintained. To support detecting problems in test code and improving its quality, we implemented 42 static checks for analyzing JUnit tests. These checks encompass best practices for writing unit tests, common issues observed in using xUnit frameworks, and our experiences collected from several years of providing training and reviews of test code for industry and in teaching. The checks can be run using the open source analysis tool PMD. In addition to a description of the implemented checks and their rationale, we demonstrate the applicability of using static analysis for test code by analyzing the unit tests of the open source project JFreeChart.
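To illustrate the flavor of such static checks, here is a toy sketch of one common xUnit issue a checker might flag: a `@Test` method that contains no assertion. This is NOT the paper's actual PMD implementation (real rules work on an abstract syntax tree, not raw text); it is a deliberately rough text heuristic for illustration only.

```java
// Toy sketch of a static test-code check, in the spirit of PMD-style
// rules for JUnit tests. This is an invented heuristic, not the
// implementation described in the paper: it flags @Test methods whose
// following source chunk contains no "assert" call.
import java.util.ArrayList;
import java.util.List;

public class AssertionlessTestCheck {

    // Rough heuristic over raw source text: split on "@Test" and look
    // for an assertion keyword in each chunk that follows a @Test marker.
    static List<Integer> findAssertionlessTests(String source) {
        List<Integer> flagged = new ArrayList<>();
        String[] chunks = source.split("@Test");
        for (int i = 1; i < chunks.length; i++) {  // chunk 0 precedes the first @Test
            if (!chunks[i].contains("assert")) {
                flagged.add(i);                    // 1-based index of the smelly test
            }
        }
        return flagged;
    }

    public static void main(String[] args) {
        String source = ""
            + "@Test public void ok() { assertEquals(2, 1 + 1); }\n"
            + "@Test public void smelly() { new Widget().render(); }\n";
        System.out.println(findAssertionlessTests(source));  // flags the second test
    }
}
```

A production rule would instead traverse the parsed AST (as PMD rules do), so it would not be fooled by the word "assert" in comments or strings; the sketch only conveys the idea of mechanically scanning test code for a known anti-pattern.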