Regression testing is a crucial part of software development. It checks that software changes do not break existing functionality. An important assumption of regression testing is that test outcomes are deterministic: an unmodified test is expected to either always pass or always fail for the same code under test. Unfortunately, in practice, some testsoften called flaky tests-have non-deterministic outcomes. Such tests undermine the regression testing as they make it difficult to rely on test results.We present the first extensive study of flaky tests. We study in detail a total of 201 commits that likely fix flaky tests in 51 open-source projects. We classify the most common root causes of flaky tests, identify approaches that could manifest flaky behavior, and describe common strategies that developers use to fix flaky tests. We believe that our insights and implications can help guide future research on the important topic of (avoiding) flaky tests.
No abstract
Multithreaded code is notoriously hard to develop and test. A multithreaded test exercises the code under test with two or more threads. Each test execution follows some schedule/interleaving of the multiple threads, and different schedules can give different results. Developers often want to enforce a particular schedule for test execution, and to do so, they use time delays (Thread.sleep in Java). Unfortunately, this approach can produce false positives or negatives, and can result in unnecessarily long testing time.This paper presents IMUnit, a novel approach to specifying and executing schedules for multithreaded tests. We introduce a new language that allows explicit specification of schedules as orderings on events encountered during test execution. We present a tool that automatically instruments the code to control test execution to follow the specified schedule, and a tool that helps developers migrate their legacy, sleep-based tests into event-based tests in IMUnit. The migration tool uses novel techniques for inferring events and schedules from the executions of sleep-based tests. We describe our experience in migrating over 200 tests. The inference techniques have high precision and recall of over 75%, and IMUnit reduces testing time compared to sleepbased tests on average 3.39x.
Successful software evolves as developers add more features, respond to requirements changes, and fix faults. Regression testing is widely used for ensuring the validity of evolving software. As regression test suites grow over time, it becomes expensive to execute them. The problem is exacerbated when test suites contain multithreaded tests. These tests are generally long running as they explore many different thread schedules searching for concurrency faults such as dataraces, atomicity violations, and deadlocks. While many techniques have been proposed for regression test prioritization, selection, and minimization for sequential tests, there is not much work for multithreaded code.We present a novel technique, called Change-Aware Preemption Prioritization (CAPP), that uses information about the changes in software evolution to prioritize the exploration of schedules in a multithreaded regression test. We have implemented CAPP in two frameworks for systematic exploration of multithreaded Java code. We evaluated CAPP on the detection of 15 faults in multithreaded Java programs, including large open-source programs. The results show that CAPP can substantially reduce the exploration required to detect multithreaded regression faults.
To improve the overall user experience, graphical user interfaces (GUIs) of successful software systems evolve continuously. While the evolution is beneficial for end users, it creates several problems for developers and testers. Developers need to manually change the GUI code. Testers need to manually inspect and repair highly fragile test scripts. This is time-consuming and error-prone.The state-of-the-art tools for automatic GUI test repair use a black-box approach: they try to infer the changes between two GUI versions and then apply these changes to the test scripts. However, inferring these changes is challenging.We propose a white-box approach where the GUI changes are automated and knowledge about them is reused to repair the test cases appropriately. We use GUI refactorings as a means to encode the evolution of the GUIs. We envision a smart IDE that will record these refactorings precisely as they happen and will use them to change the GUI code and to repair test cases. We illustrate our approach through an example, discuss challenges that should be overcome to turn our vision into reality, and present a research agenda to address these challenges.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
334 Leonard St
Brooklyn, NY 11211
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.