Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2018
DOI: 10.1145/3236024.3236026

Oreo: detection of clones in the twilight zone

Abstract: Source code clones are categorized into four types of increasing difficulty of detection, ranging from purely textual (Type-1) to purely semantic (Type-4). Most clone detectors reported in the literature work well up to Type-3, which accounts for syntactic differences. In between Type-3 and Type-4, however, there lies a spectrum of clones that, although still exhibiting some syntactic similarities, are extremely hard to detect: the Twilight Zone. Most clone detectors reported in the literature fail to operate …

Cited by 138 publications (99 citation statements)
References 56 publications
“…BigCloneBench is a benchmark which contains different types of manually validated clones in the repository IJaDataset-2.0 [21] and it defines clone types by syntactic similarity as described in Section II. The framework BigCloneEval [22] summarizes recall performance for different clone types of clone detectors automatically and it is widely used in previous work [4], [6]. We configured the BigCloneEval with minimum clone size 6 lines and 50 tokens which are consistent with the standard minimum clone size.…”
Section: Discussion (mentioning)
confidence: 99%
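The size thresholds quoted in the statement above can be pictured as a simple filter over candidate clone pairs. The following is a minimal sketch only: `CodeFragment`, `meets_minimum_size`, and `filter_pairs` are hypothetical names introduced for illustration, not part of BigCloneEval, and the sketch assumes that both the line and the token threshold must hold for each fragment of a pair.

```python
from dataclasses import dataclass

MIN_LINES = 6    # minimum clone size in lines, as quoted above
MIN_TOKENS = 50  # minimum clone size in tokens, as quoted above


@dataclass(frozen=True)
class CodeFragment:
    """Hypothetical record for a code fragment; not a BigCloneEval type."""
    name: str
    num_lines: int
    num_tokens: int


def meets_minimum_size(fragment, min_lines=MIN_LINES, min_tokens=MIN_TOKENS):
    # Assumption: a fragment qualifies only if it satisfies BOTH thresholds.
    return fragment.num_lines >= min_lines and fragment.num_tokens >= min_tokens


def filter_pairs(pairs):
    # Keep a candidate pair only when both of its fragments are large enough.
    return [(a, b) for a, b in pairs
            if meets_minimum_size(a) and meets_minimum_size(b)]


if __name__ == "__main__":
    big = CodeFragment("parseHeader", num_lines=12, num_tokens=90)
    also_big = CodeFragment("readHeader", num_lines=10, num_tokens=75)
    small = CodeFragment("getId", num_lines=3, num_tokens=14)
    kept = filter_pairs([(big, also_big), (big, small)])
    print([(a.name, b.name) for a, b in kept])
```

Under these assumptions, only the pair whose two fragments both reach 6 lines and 50 tokens is kept.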
“…Deckard [5] builds the characteristic vectors from abstract syntax tree (AST) to detect clones, but suffers from low precision and recall rate. Deep learning methods such as Oreo [6] encode software metrics into semantic vectors and achieve good results, but they mainly focus on semantic clones. For these considerations, we present a tool aimed at detecting large-variance code clones called LVMapper.…”
Section: Introduction (mentioning)
confidence: 99%
“…More details about these subcategories can be found elsewhere [19]. Action Token: Action tokens of a method are the tokens corresponding to the methods called and class fields accessed by that method [8]. Additionally, the array accesses made by a method are also special Action tokens namely ArrayAccess and ArrayAccessBinary, where array access of kind arr[i] is an ArrayAccess Action token and arr[i+1] is an ArrayAccessBinary Action token.…”
Section: A Definitions (mentioning)
confidence: 99%
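The Action token definition quoted above can be illustrated with a rough sketch. It is not the extraction used by Oreo or by the citing paper: it assumes a regex-based approximation over a Java method body given as a string (a real tool would work on a parsed syntax tree), and the function name `action_tokens`, the keyword list, and the patterns are all hypothetical. Tokens are returned grouped by kind (calls, then fields, then array accesses), not in source order.

```python
import re

# Assumption: Java keywords that can precede "(" and are not method calls.
JAVA_KEYWORDS = {"if", "for", "while", "switch", "catch", "synchronized"}

CALL_RE = re.compile(r"\b([A-Za-z_]\w*)\s*\(")            # name followed by "("
FIELD_RE = re.compile(r"\.\s*([A-Za-z_]\w*)\b(?!\s*\()")  # ".name" not followed by "("
ARRAY_RE = re.compile(r"\b[A-Za-z_]\w*\s*\[([^\[\]]*)\]")  # name[index]
BINARY_INDEX_RE = re.compile(r"\w\s*[+\-*/%]\s*\w")       # operand op operand


def action_tokens(method_body):
    """Approximate the Action tokens of a Java method body (illustrative only)."""
    tokens = []
    # Methods called by the method.
    for match in CALL_RE.finditer(method_body):
        if match.group(1) not in JAVA_KEYWORDS:
            tokens.append(match.group(1))
    # Fields accessed through a qualifier such as "this." or "obj.".
    for match in FIELD_RE.finditer(method_body):
        tokens.append(match.group(1))
    # Array accesses: arr[i] -> ArrayAccess, arr[i+1] -> ArrayAccessBinary.
    for match in ARRAY_RE.finditer(method_body):
        index = match.group(1).strip()
        if not index:
            continue  # skip type syntax such as "int[]"
        if BINARY_INDEX_RE.search(index):
            tokens.append("ArrayAccessBinary")
        else:
            tokens.append("ArrayAccess")
    return tokens


if __name__ == "__main__":
    body = ("int x = arr[i] + arr[i+1]; "
            "total = this.count + helper.compute(x); "
            "return Math.max(x, total);")
    print(action_tokens(body))
```

For this sample body the sketch reports the two called methods (`compute`, `max`), the accessed field (`count`), and one token of each array-access kind.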
“…Hence, we use 24 method level software metrics shown in Table I for Type II resolution. The details of these metrics can be found elsewhere [8], [23]. A detailed explanation about the application of Action tokens and software metrics in clone detection can be found in [8].…”
Section: Automatic Resolution Of Type II Clones (mentioning)
confidence: 99%