Testing in Continuous Integration (CI) involves test case prioritization, selection, and execution at each cycle. Selecting the most promising test cases to detect bugs is hard if there are uncertainties on the impact of committed code changes or, if traceability links between code and tests are not available. This paper introduces R , a new method for automatically learning test case selection and prioritization in CI with the goal to minimize the round-trip time between code commits and developer feedback on failed test cases. The R method uses reinforcement learning to select and prioritize test cases according to their duration, previous last execution and failure history. In a constantly changing environment, where new test cases are created and obsolete test cases are deleted, the R method learns to prioritize error-prone test cases higher under guidance of a reward function and by observing previous CI cycles. By applying R on data extracted from three industrial case studies, we show for the rst time that reinforcement learning enables fruitful automatic adaptive test case selection and prioritization in CI and regression testing.
CCS CONCEPTS•Software and its engineering → Software veri cation and validation; Software testing and debugging;
Regression testing in continuous integration environment is bounded by tight time constraints. To satisfy time constraints and achieve testing goals, test cases must be efficiently ordered in execution. Prioritization techniques are commonly used to order test cases to reflect their importance according to one or more criteria. Reduced time to test or high fault detection rate are such important criteria. In this paper, we present a case study of a test prioritization approach ROCKET (Prioritization for Continuous Regression Testing) to improve the efficiency of continuous regression testing of industrial video conferencing software. ROCKET orders test cases based on historical failure data, test execution time and domain-specific heuristics. It uses a weighted function to compute test priority. The weights are higher if tests uncover regression faults in recent iterations of software testing and reduce time to detection of faults. The results of the study show that the test cases prioritized using ROCKET (1) provide faster fault detection, and (2) increase regression fault detection rate, revealing 30% more faults for 20% of the test suite executed, comparing to manually prioritized test cases.
Test minimization techniques aim at identifying and eliminating redundant test cases from test suites in order to reduce the total number of test cases to execute, thereby improving the efficiency of testing. In the context of software product line, we can save effort and cost in the selection and minimization of test cases for testing a specific product by modeling the product line. However, minimizing the test suite for a product requires addressing two potential issues: 1) the minimized test suite may not cover all test requirements compared with the original suite; 2) the minimized test suite may have less fault revealing capability than the original suite. In this paper, we apply weight-based Genetic Algorithms (GAs) to minimize the test suite for testing a product, while preserving fault detection capability and testing coverage of the original test suite. The challenge behind is to define an appropriate fitness function, which is able to preserve the coverage of complex testing criteria (e.g., Combinatorial Interaction Testing criterion). Based on the defined fitness function, we have empirically evaluated three different weightbased GAs on an industrial case study provided by Cisco Systems, Inc. Norway. We also presented our results of applying the three weight-based GAs on five existing case studies from the literature. Based on these case studies, we conclude that among the three weight-based GAs, Random-Weighted GA (RWGA) achieved significantly better performance than the other ones.
Feature models are commonly used to specify variability in software product lines. Several tools support feature models for variability management at different steps in the development process. However, tool support for test configuration generation is currently limited. This test generation task consists in systematically selecting a set of configurations that represent a relevant sample of the variability space and that can be used to test the product line. In this paper we propose PACOGEN to analyze feature models and automatically generate a set of configurations that cover all pairwise interactions between features. PACOGEN relies on constraint programming to generate configurations that satisfy all constraints imposed by the feature model and to minimize the set of the tests configurations. This work also proposes an extensive experiment, based on the state-of-the art SPLOT feature models repository, showing that PACOGEN scales over variability spaces with millions of configurations and covers pairwise with less configurations than other available tools.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.