Crash reproduction approaches help developers during debugging by generating a test case that reproduces a given crash. Several solutions have been proposed to automate this task. However, the proposed solutions have been evaluated on a limited number of projects, making comparison difficult. In this paper, we enhance this line of research by proposing JCrashPack, an extensible benchmark for Java crash reproduction, together with ExRunner, a tool to simply and systematically run evaluations. JCrashPack contains 200 stack traces from various Java projects, including industrial open source ones, on which we run an extensive evaluation of EvoCrash, the state-of-the-art tool for search-based crash reproduction. EvoCrash successfully reproduced 43% of the crashes. Furthermore, we observed that reproducing NullPointerException, IllegalArgumentException, and IllegalStateException is relatively easier than reproducing ClassCastException, ArrayIndexOutOfBoundsException and StringIndexOutOfBoundsException. Our results include a detailed manual analysis of EvoCrash outputs, from which we derive 14 current challenges for crash reproduction, among which the generation of input data and the handling of abstract and anonymous classes are the most frequents. Finally, based on those challenges, we discuss future research directions for search-based crash reproduction for Java.
Summary Search‐based crash reproduction approaches assist developers during debugging by generating a test case, which reproduces a crash given its stack trace. One of the fundamental steps of this approach is creating objects needed to trigger the crash. One way to overcome this limitation is seeding: using information about the application during the search process. With seeding, the existing usages of classes can be used in the search process to produce realistic sequences of method calls, which create the required objects. In this study, we introduce behavioural model seeding: a new seeding method that learns class usages from both the system under test and existing test cases. Learned usages are then synthesized in a behavioural model (state machine). Then, this model serves to guide the evolutionary process. To assess behavioural model seeding, we evaluate it against test seeding (the state‐of‐the‐art technique for seeding realistic objects) and no seeding (without seeding any class usage). For this evaluation, we use a benchmark of 122 hard‐to‐reproduce crashes stemming from six open‐source projects. Our results indicate that behavioural model seeding outperforms both test seeding and no seeding by a minimum of 6% without any notable negative impact on efficiency.
EvoCrash is a recent search-based approach to generate a test case that reproduces reported crashes. The search is guided by a fitness function that uses a weighted sum scalarization to combine three different heuristics: (i) code coverage, (ii) crash coverage and (iii) stack trace similarity. In this study, we propose and investigate two alternatives to the weighted sum scalarization: (i) the simple sum scalarization and (ii) the multi-objectivization, which decomposes the fitness function into several optimization objectives as an attempt to increase test case diversity. We implemented the three alternative optimizations as an extension of EvoSuite, a popular search-based unit test generator, and applied them on 33 real-world crashes. Our results indicate that for complex crashes the weighted sum reduces the test case generation time, compared to the simple sum, while for simpler crashes the effect is the opposite. Similarly, for complex crashes, multi-objectivization reduces test generation time compared to optimizing with the weighted sum; we also observe one crash that can be replicated only by multi-objectivization. Through our manual analysis, we found out that when optimizing the original weighted function gets trapped in local optima, optimization for decomposed objectives improves the search for crash reproduction. Generally, while multi-objectivization is under-explored, our results are promising and encourage further investigations of the approach.Partially funded by the EU Horizon 2020 ICT-10-2016-RIA "STAMP" project (No.731529) and the Dutch 4TU project "Big Software on the Run".
However, these tests are very expensive and are too many to be run frequently within limited time constraints. In this paper, we investigate test case prioritization techniques to increase the ability to detect SDC regression faults with virtual tests earlier. Our empirical study conducted in the SDC domain shows that
Approaches for automatic crash reproduction aim to generate test cases that reproduce crashes starting from the crash stack traces. These tests help developers during their debugging practices. One of the most promising techniques in this research field leverages search-based software testing techniques for generating crash reproducing test cases. In this paper, we introduce Botsing, an opensource search-based crash reproduction framework for Java. Botsing implements state-of-the-art and novel approaches for crash reproduction. The well-documented architecture of Botsing makes it an easy-to-extend framework, and can hence be used for implementing new approaches to improve crash reproduction. We have applied Botsing to a wide range of crashes collected from open source systems. Furthermore, we conducted a qualitative assessment of the crash-reproducing test cases with our industrial partners. In both cases, Botsing could reproduce a notable amount of the given stack traces.Demo. video: https://www.youtube.com/watch?v=k6XaQjHqe48 Botsing website: https://stamp-project.github.io/botsing/ CCS CONCEPTS• Software and its engineering → Software testing and debugging; Search-based software engineering.
Testing with simulation environments help to identify critical failing scenarios emerging autonomous systems such as self-driving cars (SDCs) and are safer than in-field operational tests. However, these tests are very expensive and are too many to be run frequently within limited time constraints. In this paper, we investigate test case prioritization techniques to increase the ability to detect SDC regression faults with virtual tests earlier.Our approach, called SDC-Prioritizer, prioritizes virtual tests for SDCs according to static features of the roads used within the driving scenarios. These features are collected without running the tests and do not require past execution results. SDC-Prioritizer utilizes meta-heuristics to prioritize the test cases using diversity metrics (black-box heuristics) computed on these static features. Our empirical study conducted in the SDC domain shows that SDC-Prioritizer doubles the number of safety-critical failures that virtual tests can detect at the same level of execution time compared to baselines: random and greedy-based test case orderings. Furthermore, this meta-heuristic search performs statistically better than both baselines in terms of detecting safety-critical failures. SDC-Prioritizer effectively prioritize test cases for SDCs with a large improvement in fault detection while its overhead (up to 0.34% of the test execution cost) is negligible.CCS Concepts: • Software and its engineering → Search-based software engineering; Software testing and debugging.
Search-based approaches have been used in the literature to automate the process of creating unit test cases. However, related work has shown that generated unit-tests with high code coverage could be ineffective, i.e., they may not detect all faults or kill all injected mutants. In this paper, we proposed an integration-level test case generator named CLING, that exploits the integration code of a pair of classes (caller and callee) that interact with one another through method calls. In particular, CLING generates integration-level test cases that maximize the Coupled Branches Criterion (CBC). CBC is a novel integration-level coverage criterion that measures how thoroughly a test suite exercises the interactions between callers and callees. We evaluate CLING on 140 pairs of classes from five different open-source Java projects. Our results show that (1) CLING generates test suites with high CBC coverage;(2) such generated suites can kill, on average, 10% mutants that are not killable by unit-level tests generated with EVOSUITE for the same classes; (3) CLING detects 29 integration faults that remain undetected when using automatically generated unit-level test suites.
Evolutionary-based crash reproduction techniques aid developers in their debugging practices by generating a test case that reproduces a crash given its stack trace. In these techniques, the search process is typically guided by a single search objective called Crash Distance. Previous studies have shown that current approaches could only reproduce a limited number of crashes due to a lack of diversity in the population during the search. In this study, we address this issue by applying Multi-Objectivization using Helper-Objectives (MO-HO) on crash reproduction. In particular, we add two helperobjectives to the Crash Distance to improve the diversity of the generated test cases and consequently enhance the guidance of the search process. We assessed MO-HO against the single-objective crash reproduction. Our results show that MO-HO can reproduce two additional crashes that were not previously reproducible by the single-objective approach. CCS CONCEPTS • Software and its engineering → Software testing and debugging; Search-based software engineering.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.