Abstract-Many techniques for detecting duplicated source code (software clones) have been proposed in the past. However, it is not yet clear how these techniques compare in terms of recall and precision as well as space and time requirements. This paper presents an experiment that evaluates six clone detectors based on eight large C and Java programs (altogether almost 850 KLOC). Their clone candidates were evaluated by one of the authors as an independent third party. The selected techniques cover the whole spectrum of the state-of-the-art in clone detection. The techniques work on text, lexical and syntactic information, software metrics, and program dependency graphs.
A new method for software safety analysis is presented which uses program slicing and constraint solving to construct and analyze
path conditions
, conditions defined on a program's input variables which must hold for information flow between two points in a program. Path conditions are constructed from subgraphs of a program's dependence graph, specifically, slices and chops. The article describes how constraint solvers can be used to determine if a path condition is satisfiable and, if so, to construct a witness for a safety violation, such as an information flow from a program point at one security level to another program point at a different security level. Such a witness can prove useful in legal matters.The article reviews previous research on path conditions in program dependence graphs; presents new extensions of path conditions for arrays, pointers, abstract data types, and multithreaded programs; presents new decomposition formulae for path conditions; demonstrates how interval analysis and BDDs (binary decision diagrams) can be used to reduce the scalability problem for path conditions; and presents case studies illustrating the use of path conditions in safety analysis. Applying interval analysis and BDDs is shown to overcome the combinatorial explosion that can occur in constructing path conditions. Case studies and empirical data demonstrate the usefulness of path conditions for analyzing practical programs, in particular, how illegal influences on safety-critical programs can be discovered and analyzed.
Current slicing techniques cannot handle systems written in multiple programming languages. Observation-Based Slicing (ORBS) is a language-independent slicing technique capable of slicing multilanguage systems, including systems which contain (third party) binary components. A potential slice obtained through repeated statement deletion is validated by observing the behaviour of the program: if the slice and original program behave the same under the slicing criterion, the deletion is accepted. The resulting slice is similar to a dynamic slice. We evaluate five variants of ORBS on ten programs of different sizes and languages showing that it is less expensive than similar existing techniques. We also evaluate it on bash and four other systems to demonstrate feasible large-scale operation in which a parallelised ORBS needs up to 82% less time when using four threads. The results show that an ORBS slicer is simple to construct, effective at slicing, and able to handle systems written in multiple languages without specialist analysis tools.
Clone detection finds application in many software engineering activities such as comprehension and refactoring. However, the confounding configuration choice problem poses a widely-acknowledged threat to the validity of previous empirical analyses. We introduce desktop and parallelised cloud-deployed versions of a search based solution that finds suitable configurations for empirical studies. We evaluate our approach on 6 widely used clone detection tools applied to the Bellon suite of 8 subject systems. Our evaluation reports the results of 9.3 million total executions of a clone tool; the largest study yet reported. Our approach finds significantly better configurations (p < 0.05) than those currently used, providing evidence that our approach can ameliorate the confounding configuration choice problem.
Static program slicing is an established method for analyzing sequential programs, especially for program understanding, debugging and testing. Until now, there was no slicing method for threaded programs which handles interference correctly. We present such a method which also calculates more precise static slices. This paper extends the well known structures of the control flow graph and the program dependence graph for threaded programs with interference. This new technique does not require serialization of threaded programs.
scite is a Brooklyn-based startup that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.