2007
DOI: 10.1109/tse.2007.70725
Comparison and Evaluation of Clone Detection Tools

Abstract: Many techniques for detecting duplicated source code (software clones) have been proposed in the past. However, it is not yet clear how these techniques compare in terms of recall and precision as well as space and time requirements. This paper presents an experiment that evaluates six clone detectors based on eight large C and Java programs (altogether almost 850 KLOC). Their clone candidates were evaluated by one of the authors as an independent third party. The selected techniques cover the whole s…

Cited by 640 publications (591 citation statements)
References 30 publications (50 reference statements)
“…Although there is a large number of clone detectors, plagiarism detectors, and code similarity detectors invented in the research community, there are relatively few studies that compare and evaluate their performances. Bellon et al (2007) proposed a framework for comparing and evaluating 6 clone detectors, evaluated a large set of clone detection tools but only based on results obtained from the tools' published papers, Hage et al (2010) compare five plagiarism detectors against 17 code modifications, Burd and Bailey (2002) compare five clone detectors for preventive maintenance tasks, Biegel et al (2011) compare three code similarity measures to identify code that needs refactoring, Svajlenko and Roy (2016) developed and used a clone evaluation framework called BigCloneEval to evaluate 10 state-of-the-art clone detectors. Although these studies cover various goals of tool evaluation and cover the different types of code modification found in the chosen data sets, they suffer from two limitations: (1) the selected tools are limited to only a subset of clone or plagiarism detectors, and (2) the results are based on different data sets, so one cannot compare a tool's performance from one study to another tool's from another study.…”
Section: Motivation (mentioning)
confidence: 99%
“…Examples of code similarity analysers using graph-based approaches are the ones invented by Krinke (2001), Komondoor and Horwitz (2001), Chae et al (2013) and Chen et al (2014). Although the tools demonstrate high precision and recall (Krinke 2001), they suffer scalability issues (Bellon et al 2007).…”
Section: Code Similarity Measurement (mentioning)
confidence: 99%
“…With this definition, we build on the general definition of a code clone: ''two code fragments form a clone pair if they are similar enough according to a given definition of similarity'' (Bellon et al., 2007). Intuitively, we are interested in fragments that are similar enough that the clones are interesting for a developer of the system while changing it.…”
Section: Terminology (mentioning)
confidence: 99%
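
The "similar enough" notion in the statement above is deliberately parameterized by a similarity measure and a threshold. As a minimal sketch, assuming a line-based Jaccard similarity and an arbitrary 0.8 threshold (both illustrative choices, not the measure used by Bellon et al. 2007), a clone-pair check in Java could look like this:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Illustrative sketch only: the similarity measure (Jaccard over normalized lines)
// and the 0.8 threshold are assumptions, not taken from Bellon et al. (2007).
public class ClonePairCheck {

    // Normalize a fragment: trim whitespace and drop blank lines (Type-1 style normalization).
    static Set<String> normalizedLines(List<String> fragment) {
        return fragment.stream()
                .map(String::trim)
                .filter(line -> !line.isEmpty())
                .collect(Collectors.toSet());
    }

    // Jaccard similarity of the two fragments' normalized line sets.
    static double similarity(List<String> a, List<String> b) {
        Set<String> sa = normalizedLines(a);
        Set<String> sb = normalizedLines(b);
        Set<String> intersection = new HashSet<>(sa);
        intersection.retainAll(sb);
        Set<String> union = new HashSet<>(sa);
        union.addAll(sb);
        return union.isEmpty() ? 0.0 : (double) intersection.size() / union.size();
    }

    // "Similar enough": the pair is a clone if similarity reaches the chosen threshold.
    static boolean isClonePair(List<String> a, List<String> b, double threshold) {
        return similarity(a, b) >= threshold;
    }

    public static void main(String[] args) {
        List<String> f1 = List.of("int sum = 0;", "for (int i = 0; i < n; i++) {", "  sum += a[i];", "}");
        List<String> f2 = List.of("int sum = 0;", "for (int i = 0; i < n; i++) {", "    sum += a[i];", "}");
        System.out.println(isClonePair(f1, f2, 0.8)); // true: the fragments differ only in layout
    }
}
```

Swapping in a token-, metric-, or AST-based measure changes which pairs pass the threshold, which is exactly why the quoted definition leaves "similarity" open.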
“…Juergens, Deissenboeck & Hummel (2010b) reported on an experiment to investigate the differences between syntactical/representational and semantic/behavioural similarities of clones, defined as follows:
Type-1 clone: Similar code fragments except for variation in whitespace, layout and comments (Bellon et al., 2007)
Type-2 clone: Similar code fragments except for variation in identifiers, literals, types, whitespaces, layouts and comments (Bellon et al., 2007)
Type-3 clone: Similar code fragments except that some statements may be added or deleted in addition to variation in identifiers, literals, types, whitespaces, layouts or comments (Bellon et al., 2007)
Type-4 clone: Two or more code fragments that perform the same computation but are implemented by different syntactic variants (Roy, Cordy & Koschke, 2009)
Functionally similar clone (FSC): Code fragments that provide a similar functionality w.r.t. a given definition of similarity but can be implemented quite differently…”
Section: Related Work (mentioning)
confidence: 99%
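
The clone-type definitions listed above can be made concrete with small Java fragments. The methods below are invented, hypothetical examples that merely mirror those definitions; they do not come from any of the cited studies:

```java
public class CloneTypeExamples {

    // Original fragment.
    static int sumOriginal(int[] a) {
        int total = 0;
        for (int i = 0; i < a.length; i++) {
            total += a[i];
        }
        return total;
    }

    // Type-1: identical apart from whitespace, layout and comments.
    static int sumType1(int[] a) {
        int total = 0; // running sum
        for (int i = 0; i < a.length; i++) { total += a[i]; }
        return total;
    }

    // Type-2: identifiers renamed, structure unchanged.
    static int sumType2(int[] values) {
        int acc = 0;
        for (int k = 0; k < values.length; k++) {
            acc += values[k];
        }
        return acc;
    }

    // Type-3: a statement added on top of the Type-2 changes.
    static int sumType3(int[] values) {
        if (values == null) return 0; // added statement
        int acc = 0;
        for (int k = 0; k < values.length; k++) {
            acc += values[k];
        }
        return acc;
    }

    // Type-4: same computation, different syntactic variant.
    static int sumType4(int[] a) {
        return java.util.Arrays.stream(a).sum();
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3};
        System.out.println(sumOriginal(data) + " " + sumType1(data) + " "
                + sumType2(data) + " " + sumType3(data) + " " + sumType4(data)); // 6 6 6 6 6
    }
}
```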
“…Software metrics or code metrics are used to detect clones [22, 28, 23]. AST-based comparison is considered to be more accurate than token-based comparison [3]. Baxter et al. [2] detect code clones using abstract syntax trees.…”
Section: Related Work (mentioning)
confidence: 99%
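
To make the token-based baseline mentioned in the last statement concrete, here is a minimal sketch of token normalization, assuming a crude whitespace/punctuation tokenizer and a tiny keyword list; it is not a description of any of the cited tools:

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch of token-based comparison with identifier/literal normalization.
// The tokenizer and keyword list are simplifying assumptions, not a real lexer.
public class TokenCloneSketch {

    // Crude tokenizer: split at every boundary between word and non-word characters.
    static List<String> tokenize(String code) {
        return Arrays.stream(code.split("(?<=\\W)|(?=\\W)"))
                .map(String::trim)
                .filter(t -> !t.isEmpty())
                .collect(Collectors.toList());
    }

    // Map identifiers to ID and numeric literals to LIT, so renamed (Type-2) clones
    // produce identical token sequences.
    static List<String> normalize(List<String> tokens) {
        return tokens.stream()
                .map(t -> t.matches("\\d+") ? "LIT"
                        : (t.matches("[A-Za-z_]\\w*") && !isKeyword(t)) ? "ID"
                        : t)
                .collect(Collectors.toList());
    }

    static boolean isKeyword(String t) {
        return List.of("int", "for", "if", "while", "return").contains(t);
    }

    public static void main(String[] args) {
        String f1 = "int total = 0; for (int i = 0; i < n; i++) total += a[i];";
        String f2 = "int acc = 0; for (int k = 0; k < m; k++) acc += b[k];";
        // Equal normalized token sequences: reported as a clone despite renaming.
        System.out.println(normalize(tokenize(f1)).equals(normalize(tokenize(f2)))); // true
    }
}
```

Because normalization erases names, structural differences that an abstract syntax tree would expose stay invisible at the token level, which is one intuition behind the accuracy claim cited as [3].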