2016
DOI: 10.7717/peerj-cs.49

How are functionally similar code clones syntactically different? An empirical study and a benchmark

Abstract: Background. Today, redundancy in source code, so-called "clones" caused by copy&paste, can be found reliably using clone detection tools. Redundancy can also arise independently, however, not caused by copy&paste. At present, it is not clear how only functionally similar clones (FSC) differ from clones created by copy&paste. Our aim is to understand and categorise the syntactical differences in FSCs that distinguish them from copy&paste clones in a way that helps clone detection research. Methods. We conduct…

Cited by 18 publications (10 citation statements)
References 24 publications

“…We sampled 10 solutions from 30 different problems for a total of 44,850 function pairs. Since these solutions all passed the automated test suite from Google they can be considered as type-4 clones [18], [19]. Out of all possible pairs, 1,350 are clones (solutions which were submitted to the same problem).…”
Section: Methods (mentioning)
confidence: 99%
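
The pair counts quoted in this statement can be reproduced with a short check. This is only a sketch under two assumptions that the excerpt does not state explicitly: each of the 300 sampled solutions counts as one function, and pairs are unordered. The variable names are illustrative and not taken from the cited paper.

```python
from math import comb

# Figures taken from the quoted statement.
PROBLEMS = 30
SOLUTIONS_PER_PROBLEM = 10

# Assumption: one function per sampled solution, unordered pairs.
total_functions = PROBLEMS * SOLUTIONS_PER_PROBLEM

# Every unordered pair among all sampled functions.
all_pairs = comb(total_functions, 2)

# Clone pairs: both solutions were submitted to the same problem.
clone_pairs = PROBLEMS * comb(SOLUTIONS_PER_PROBLEM, 2)

# Remaining pairs mix solutions from two different problems.
non_clone_pairs = all_pairs - clone_pairs

print(all_pairs, clone_pairs, non_clone_pairs)
```

Under these assumptions the totals match the 44,850 pairs and 1,350 clone pairs in the statement, which means roughly 3% of all pairs are clones.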
“…• Tools built with type-1 and type-2 clones in mind use syntactic [5] or lexical similarities [3] to detect clones. By definition, these methods cannot detect semantic similarities if the syntax used to implement them is different [19]. Their performance for type-4 clones is thus lackluster.…”
Section: Introduction (mentioning)
confidence: 99%
“…PDG based methods can detect complex Type 3 clones, e.g., Listings 1 and 2. However, the compared PDG sub-graphs are a representation of the source code; thereby, the approaches still rely on syntactic similarity [43].…”
Section: Related Work (mentioning)
confidence: 99%
“…Finally, Wagner et al. found that less than 16% of FSC pairs have actual syntactic similarities [3]. They provide a benchmark for FSCs which has, to the best of our knowledge, not yet been used to test FSC detection approaches.…”
Section: Related Work (mentioning)
confidence: 99%
“…The problem with classic approaches based on text, tokens, or syntax is that they cannot find clones with a completely different structure. We include those in the so-called Functionally Similar Clones (FSCs) [3]. FSCs have the same or similar functionality but were generally created independently.…”
Section: Introduction (mentioning)
confidence: 99%