Proceedings of the 38th International Conference on Software Engineering 2016
DOI: 10.1145/2884781.2884877

SourcererCC

Abstract: Despite a decade of active research, there is a marked lack in clone detectors that scale to very large repositories of source code, in particular for detecting near-miss clones where significant editing activities may take place in the cloned code. We present SourcererCC, a token-based clone detector that targets three clone types, and exploits an index to achieve scalability to large inter-project repositories using a standard workstation. SourcererCC uses an optimized inverted index to quickly query the pote…
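To make the idea of token-based detection over an inverted index concrete, here is a minimal Python sketch. It is not SourcererCC's implementation: the tokenizer regex, the overlap measure, the 0.8 threshold, and all function names (`tokenize`, `build_index`, `find_clones`) are assumptions chosen for illustration, and the sketch omits whatever optimizations the truncated abstract goes on to describe.

```python
import re
from collections import Counter, defaultdict


def tokenize(code):
    """Turn source text into a bag (multiset) of tokens. Hypothetical tokenizer."""
    return Counter(re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", code))


def build_index(blocks):
    """Inverted index: map each token to the ids of the blocks that contain it."""
    index = defaultdict(set)
    for block_id, bag in blocks.items():
        for token in bag:
            index[token].add(block_id)
    return index


def find_clones(blocks, threshold=0.8):
    """Return pairs of block ids whose token overlap meets the (assumed) threshold."""
    index = build_index(blocks)
    clones = set()
    for block_id, bag in blocks.items():
        # Use the index to collect only blocks sharing at least one token.
        candidates = set()
        for token in bag:
            candidates |= index[token]
        candidates.discard(block_id)
        for other in candidates:
            other_bag = blocks[other]
            shared = sum((bag & other_bag).values())  # multiset intersection size
            larger = max(sum(bag.values()), sum(other_bag.values()))
            if larger and shared / larger >= threshold:
                clones.add(tuple(sorted((block_id, other))))
    return sorted(clones)


blocks = {
    "a": tokenize("int sum(int x, int y) { return x + y; }"),
    "b": tokenize("int sum(int x, int y) { return y + x; }"),
    "c": tokenize("void log(String msg) { System.out.println(msg); }"),
}
print(find_clones(blocks))  # [('a', 'b')]
```

Blocks `a` and `b` contain the same tokens in a different order, so their bags are identical and the pair is reported; `c` shares only punctuation with them and falls below the threshold.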

Citations: Cited by 320 publications (22 citation statements)
References: 44 publications
“…We tested these alternative setups for search recall on the same micro-benchmark dataset. SourcererCC [Sajnani et al 2016] is a state-of-the-art clone detector that supports Type-3 clone detection. We wanted to compare Aroma with SourcererCC to examine whether a current-generation clone detector can be used as the light-weight search phase in Aroma.…”
Section: Comparison With Clone Detectors and Conventional Search Tech… (mentioning)
confidence: 99%
“…Clone detectors are designed to detect syntactically identical or highly similar code. SourcererCC [Sajnani et al 2016] is a token-based clone detector targeting Type 1, 2, and 3 clones. Compared with other clone detectors that also support Type 3 clones, including NiCad [Cordy and Roy 2011], Deckard [Jiang et al 2007], and CCFinder, SourcererCC has high precision and recall and also scales to large-size projects.…”
Section: Related Work (mentioning)
confidence: 99%
“…They also cannot detect changes within method bodies. Our work combines several existing tools and techniques, such as ChangeDistiller [22], SourcererCC [35], and RefactoringMiner [36] to get a comprehensive set of method-level changes.…”
Section: Related Work (mentioning)
confidence: 99%
“…Since we do not currently have access to proprietary vendor-specific code, we use a popular community-based variant of Android, LineageOS, as a proxy for a vendor-based version of Android. For each subsystem in AOSP, we track the method-level changes in the source code using a combination of existing code-evolution analysis tools: SourcererCC [35], ChangeDistiller [22], and RefactoringMiner [36]. We are interested in changes in two directions.…”
Section: Introduction (mentioning)
confidence: 99%