Random testing is not only a useful testing technique in itself, but also plays a core role in many other testing methods. Hence, any significant improvement to random testing has an impact throughout the software testing community. Recently, Adaptive Random Testing (ART) was proposed as an effective alternative to random testing. This paper presents a synthesis of the most important research results related to ART. In the course of our research and through further reflection, we have realised how the techniques and concepts of ART can be applied in a much broader context, which we present here. We believe such ideas can be applied in a variety of areas of software testing, and even beyond software testing. Amongst these ideas, we particularly note the fundamental role of diversity in test case selection strategies. We hope this paper serves to provoke further discussions and investigations of these ideas.
An important research area of Spectrum-Based Fault Localization (SBFL) is the effectiveness of risk evaluation formulas. Most previous studies have adopted an empirical approach, which can hardly be considered as sufficiently comprehensive because of the huge number of combinations of various factors in SBFL. Though some studies aimed at overcoming the limitations of the empirical approach, none of them has provided a completely satisfactory solution. Therefore, we provide a theoretical investigation on the effectiveness of risk evaluation formulas. We define two types of relations between formulas, namely, equivalent and better. To identify the relations between formulas, we develop an innovative framework for the theoretical investigation. Our framework is based on the concept that the determinant for the effectiveness of a formula is the number of statements with risk values higher than the risk value of the faulty statement. We group all program statements into three disjoint sets with risk values higher than, equal to, and lower than the risk value of the faulty statement, respectively. For different formulas, the sizes of their sets are compared using the notion of subset. We use this framework to identify the maximal formulas which should be the only formulas to be used in SBFL.
Machine Learning algorithms have provided core functionality to many application domains - such as bioinformatics, computational linguistics, etc. However, it is difficult to detect faults in such applications because often there is no “test oracle” to verify the correctness of the computed outputs. To help address the software quality, in this paper we present a technique for testing the implementations of machine learning classification algorithms which support such applications. Our approach is based on the technique “metamorphic testing”, which has been shown to be effective to alleviate the oracle problem. Also presented include a case study on a real-world machine learning application framework, and a discussion of how programmers implementing machine learning algorithms can avoid the common pitfalls discovered in our study. We also conduct mutation analysis and cross-validation, which reveal that our method has high effectiveness in killing mutants, and that observing expected cross-validation result alone is not sufficiently effective to detect faults in a supervised classification program. The effectiveness of metamorphic testing is further confirmed by the detection of real faults in a popular open-source classification program.
There are two fundamental limitations in software testing, known as the reliable test set problem and the oracle problem. Fault-based testing is an attempt by Morell to alleviate the reliable test set problem. In this paper, we propose to enhance fault-based testing to alleviate the oracle problem as well. We present an integrated method that combines metamorphic testing with fault-based testing using real and symbolic inputs.
Restricted Random Testing (RRT) is a new method of testing software that improves upon traditional Random Testing (RT) techniques. Research has indicated that failure patterns (portions of an input domain which, when executed, cause the program to fail or reveal an error) can influence the effectiveness of testing strategies. For certain types of failure patterns, it has been found that a widespread and even distribution of test cases in the input domain can be significantly more effective at detecting failure compared with ordinary RT. Testing methods based on RT, but which aim to achieve even and widespread distributions, have been called Adaptive Random Testing (ART) strategies. One implementation of ART is RRT. RRT uses exclusion zones around executed, but non-failure-causing, test cases to restrict the regions of the input domain from which subsequent test cases may be drawn. In this paper, we introduce the motivation behind RRT, explain the algorithm and detail some empirical analyses carried out to examine the effectiveness of the method. Two versions of RRT are presented: Ordinary RRT (ORRT) and Normalized RRT (NRRT). The two versions share the same fundamental algorithm, but differ in their treatment of non-homogeneous input domains. Investigations into the use of alternative exclusion shapes are outlined, and a simple technique for reducing the computational overheads of RRT, prompted by the alternative exclusion shape investigations, is also explained. The performance of RRT is compared with RT and another ART method based on maximized minimum test case separation (DART), showing excellent improvement over RT and a very favorable comparison with DART.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.