Combinatorial interaction testing (CIT) is an effective failure detection method for many types of software systems. This review discusses the current approaches CIT uses in detecting parameter interactions, the difficulties of applying it in practice, recent advances, and opportunities for future research.
The increasing complexity of configurable software systems has created a need for more intelligent sampling mechanisms to detect and characterize failure-inducing dependencies between configurations. Prior work -in idealized environments -has shown that test schedules based on a mathematical object, called a covering array, in combination with classification techniques, can meet this need. Applying this approach in practice, however, is tricky because testing time and resource availability are unpredictable, and because failure characteristics can change from release to release. With current approaches developers must set a key covering array parameter (its strength) based on estimated release times and failure characterizations. This will influence the outcome of their results.In this paper we propose a new approach that incrementally builds covering array schedules. This approach begins at a low strength, and then iteratively increases strength as resources allow. At each stage previously tested configurations are reused, thus avoiding duplication of work. With the incremental approach developers need never commit to a specific covering array strength. Instead, by using progressively stronger covering array schedules, failures due to few configuration dependencies can be found and classified as soon and as cheaply as possibly. Additionally, it eliminates the risks of committing to overly strong test schedules.We evaluate this new approach through a case study on three consecutive releases of MySQL, an open source database. Our results suggest that our approach is as good or better than previous approaches, costing less in most cases, and allowing greater flexibility in environments with unpredictable development constraints.
Abstract-There is an increasing interest in techniques that support analysis and measurement of fielded software systems. These techniques typically deploy numerous instrumented instances of a software system, collect execution data when the instances run in the field, and analyze the remotely collected data to better understand the system's in-the-field behavior. One common need for these techniques is the ability to distinguish execution outcomes (e.g., to collect only data corresponding to some behavior or to determine how often and under which condition a specific behavior occurs). Most current approaches, however, do not perform any kind of classification of remote executions and either focus on easily observable behaviors (e.g., crashes) or assume that outcomes' classifications are externally provided (e.g., by the users). To address the limitations of existing approaches, we have developed three techniques for automatically classifying execution data as belonging to one of several classes. In this paper, we introduce our techniques and apply them to the binary classification of passing and failing behaviors. Our three techniques impose different overheads on program instances and, thus, each is appropriate for different application scenarios. We performed several empirical studies to evaluate and refine our techniques and to investigate the trade-offs among them. Our results show that 1) the first technique can build very accurate models, but requires a complete set of execution data; 2) the second technique produces slightly less accurate models, but needs only a small fraction of the total execution data; and 3) the third technique allows for even further cost reductions by building the models incrementally, but requires some sequential ordering of the software instances' instrumentation.
The increasing complexity of configurable software systems creates a need for more intelligent sampling mechanisms to detect and locate failure-inducing dependencies between configurations. Prior work shows that test schedules based on a mathematical object, called a covering array, can be used to detect and locate failures in combination with a classification tree analysis. This paper addresses limitations of the earlier approach. First, the previous work requires developers to choose the covering array's strength, even though there is no scientific or historical basis for doing so. Second, if a single covering array is insufficient to classify specific failures, the entire process must be rerun from scratch. To address these issues, our new approach incrementally and adaptively builds covering array schedules. It begins with a low strength, and continually increases this as resources allow, or poor classification results require. At each stage, previous tests are reused. This allows failures due to only one or two configurations settings to be found and classified as early as possible, and also limits duplication of work when multiple covering arrays must be used.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.