Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering
DOI: 10.1145/2491411.2491420

Searching for better configurations: a rigorous approach to clone evaluation

Abstract: Clone detection finds application in many software engineering activities such as comprehension and refactoring. However, the confounding configuration choice problem poses a widely-acknowledged threat to the validity of previous empirical analyses. We introduce desktop and parallelised cloud-deployed versions of a search based solution that finds suitable configurations for empirical studies. We evaluate our approach on 6 widely used clone detection tools applied to the Bellon suite of 8 subject systems. Our …

Cited by 129 publications (109 citation statements) · References 37 publications

Citation statements, ordered by relevance:
“…For example, ccfx's best configurations have a smaller b (minimum number of tokens) of 15, compared to the default value of 50. The results for both pervasively modified code and boiler-plate code show that the default configurations cannot offer the tools their best performance. These empirical results support the findings of Wang et al. (2013) that one cannot rely on the tools' default configurations. We suggest researchers and practitioners try their best to tune the tools before performing any benchmarking or comparisons of the tools' results to mitigate the threats to internal validity in their studies.…”
Section: Boiler-plate Code (supporting)
confidence: 84%
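
As a concrete illustration of the tuning step this statement recommends, the Python sketch below sweeps a detector's minimum-token threshold (ccfx's b) and keeps the value with the best F-score against a reference clone set. This is a minimal sketch under stated assumptions, not the cited authors' tooling: run_detector is a hypothetical callback standing in for an actual tool invocation, and the candidate values are arbitrary.

    # Minimal tuning sketch (illustrative; `run_detector` is a hypothetical
    # stand-in for invoking a real clone detector such as ccfx with a given
    # minimum-token threshold b).
    from typing import Callable, Iterable, Set, Tuple

    ClonePair = Tuple[str, str]  # identifiers of two cloned code fragments

    def f_score(detected: Set[ClonePair], reference: Set[ClonePair]) -> float:
        """Harmonic mean of precision and recall against a reference clone set."""
        if not detected or not reference:
            return 0.0
        tp = len(detected & reference)
        if tp == 0:
            return 0.0
        precision = tp / len(detected)
        recall = tp / len(reference)
        return 2 * precision * recall / (precision + recall)

    def tune_min_tokens(run_detector: Callable[[int], Set[ClonePair]],
                        reference: Set[ClonePair],
                        candidates: Iterable[int]) -> Tuple[int, float]:
        """Run the detector once per candidate b and return the best (b, F-score)."""
        best_b, best_f = -1, -1.0
        for b in candidates:
            f = f_score(run_detector(b), reference)
            if f > best_f:
                best_b, best_f = b, f
        return best_b, best_f

With candidates such as range(15, 55, 5), a sweep of this shape would surface the kind of gap the statement reports between b=15 and the default b=50; the point is simply that the tuned value, not the default, should be used when benchmarking or comparing tools.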
“…For example, using the default settings for ccfx (b=50, t=12) leads to a very low F-score of 0.5781 due to a very high number of false negatives. Interestingly, a previous study on agreement of clone detectors (Wang et al. 2013) observed the same difference between default and optimal configurations.…”
Section: Pervasively Modified Code (mentioning)
confidence: 55%
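
For reference, the F-score cited above is the standard harmonic mean of precision and recall over true positives (TP), false positives (FP), and false negatives (FN):

    \[ P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}, \qquad F = \frac{2PR}{P + R} \]

A very high FN count depresses recall R and therefore F, which is the mechanism the statement attributes to ccfx's default configuration; the specific value 0.5781 is reported by the citing study, not derived here.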
“…These adjustments can be thought of as a configuration choice problem [7], and we will show a brief result of these adjustments in the next subsection.…”
Section: Results (mentioning)
confidence: 99%