Proceedings of the 6th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the F 2007
DOI: 10.1145/1287624.1287698

Efficient token based clone detection with flexible tokenization

Abstract: Code clones are similar code fragments that occur at multiple locations in a software system. Detection of code clones provides useful information for maintenance, reengineering, program understanding and reuse. Several techniques have been proposed to detect code clones. These techniques differ in the code representation used for analysis of clones, ranging from plain text to parse trees and program dependence graphs. Clone detection based on lexical tokens involves minimal code transformation and gives good …

Cited by 57 publications (47 citation statements)
References 23 publications (19 reference statements)

“…This vagueness is reflected in the definitions "[c]lones are segments of code that are similar according to some definition of similarity" by Baxter et al. (1998) and "code clones [...] are code fragments of considerable length and significant similarity" by Basit and Jarzabek (2007). The latter definition identifies clones as long enough pieces of code that share sufficiently many traits, while the first has no such requirements.…”
Section: Related Work (mentioning)
confidence: 99%
“…Even though Baker [2,4] also used the token scheme to detect the clone but it did not use any transformation technique resulting in detection of false positives. For more flexible tokenization RTF [7] used suffix array rather than suffix tree so that unnecessary tokens can be removed so as to reduce the false detection but this technique is more complex to implement.…”
Section: International Journal Of Computer Applications (0975-8887) (mentioning)
confidence: 99%
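
The statement above contrasts suffix trees with a suffix array built over a token sequence. As a rough illustration only, the sketch below tokenizes source text, optionally normalizes identifiers and numbers, and looks for repeated token runs by comparing neighbouring entries of a naively sorted suffix array. It is not the RTF algorithm from the cited paper: the regular expression, the `KEYWORDS` set, and the function names `tokenize`, `suffix_array`, and `find_clone_pairs` are all assumptions made for this example, and the sketch does not filter overlapping or non-maximal matches.

```python
import re

# Hypothetical token pattern: identifiers, integer literals, or any single symbol.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")
# Hypothetical keyword set; keywords are never normalized.
KEYWORDS = {"if", "else", "for", "while", "return", "int"}


def tokenize(source, normalize=True):
    """Split source text into tokens; optionally map names to ID and numbers to NUM."""
    tokens = []
    for tok in TOKEN_RE.findall(source):
        if normalize and tok not in KEYWORDS:
            if tok[0].isalpha() or tok[0] == "_":
                tok = "ID"
            elif tok[0].isdigit():
                tok = "NUM"
        tokens.append(tok)
    return tokens


def suffix_array(tokens):
    """Naive suffix array: start indices sorted by the suffix they begin."""
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])


def common_prefix_len(tokens, i, j):
    """Number of equal tokens at the start of the suffixes beginning at i and j."""
    n = 0
    while i + n < len(tokens) and j + n < len(tokens) and tokens[i + n] == tokens[j + n]:
        n += 1
    return n


def find_clone_pairs(tokens, min_len):
    """Return (start_a, start_b, length) for adjacent suffixes sharing >= min_len tokens."""
    sa = suffix_array(tokens)
    pairs = []
    for a, b in zip(sa, sa[1:]):
        n = common_prefix_len(tokens, a, b)
        if n >= min_len:
            pairs.append((min(a, b), max(a, b), n))
    return sorted(pairs)


source = "x = a + 1; y = b + 2;"
print(find_clone_pairs(tokenize(source), 6))
print(find_clone_pairs(tokenize(source, normalize=False), 6))
```

With normalization switched on, the two statements reduce to the same token run and are reported as a pair; with it switched off, the differing names and literals prevent a match. The naive sort is quadratic or worse in the number of tokens, so a real tool would use a dedicated suffix-array construction.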
“…Software Engineering: Identifying approximate clones of methods or classes in large software systems is important to software maintenance [4].…”
Section: Analysis Of Musical Texts (mentioning)
confidence: 99%
“…If $M_{x,u}$ is neither LE nor RE, we say that it is nonextendible (NE). In $x = abaababa$, the NE repeats are $M_{x,a} = (1; 1, 3, 4, 6, 8)$ and $M_{x,aba} = (3; 1, 4, 6)$.…”
Section: Repeats (mentioning)
confidence: 99%
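
The occurrence lists in the example above can be checked mechanically. The short sketch below assumes that the first component of each tuple is the length of the repeating substring and that the remaining components are 1-based start positions; that reading is inferred from the two tuples quoted, not stated in the excerpt. The function names `occurrences` and `repeat` are hypothetical, and the sketch does not test the LE, RE, or NE properties, whose definitions are not given here.

```python
def occurrences(x, u):
    """1-based start positions of every (possibly overlapping) occurrence of u in x."""
    return [i + 1 for i in range(len(x) - len(u) + 1) if x.startswith(u, i)]


def repeat(x, u):
    """Assumed tuple layout: (length of u, list of start positions)."""
    return (len(u), occurrences(x, u))


x = "abaababa"
print(repeat(x, "a"))
print(repeat(x, "aba"))
```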