Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering
DOI: 10.1145/2491411.2491420

Searching for better configurations: a rigorous approach to clone evaluation

Abstract: Clone detection finds application in many software engineering activities such as comprehension and refactoring. However, the confounding configuration choice problem poses a widely-acknowledged threat to the validity of previous empirical analyses. We introduce desktop and parallelised cloud-deployed versions of a search based solution that finds suitable configurations for empirical studies. We evaluate our approach on 6 widely used clone detection tools applied to the Bellon suite of 8 subject systems. Our …

Cited by 129 publications (109 citation statements) · References 37 publications

Citation statements, ordered by relevance:
“…For example, ccfx's best configurations have a smaller b (minimum number of tokens) of 15, compared to the default value of 50. The results for both pervasively modified code and boiler-plate code show that the default configurations cannot offer the tools their best performance. These empirical results support the findings of Wang et al. (2013) that one cannot rely on the tools' default configurations. We suggest researchers and practitioners try their best to tune the tools before performing any benchmarking or comparisons of the tools' results to mitigate the threats to internal validity in their studies.…”
Section: Boiler-plate Code (supporting)
confidence: 84%
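
As a concrete illustration of the tuning step this statement recommends, the Python sketch below sweeps a detector's minimum-token threshold (ccfx's b) and keeps the value with the best F-score against a reference clone set. This is a minimal sketch under stated assumptions, not the cited authors' tooling: run_detector is a hypothetical callback standing in for an actual tool invocation, and the candidate values are arbitrary.

    # Minimal tuning sketch (illustrative; `run_detector` is a hypothetical
    # stand-in for invoking a real clone detector such as ccfx with a given
    # minimum-token threshold b).
    from typing import Callable, Iterable, Set, Tuple

    ClonePair = Tuple[str, str]  # identifiers of two cloned code fragments

    def f_score(detected: Set[ClonePair], reference: Set[ClonePair]) -> float:
        """Harmonic mean of precision and recall against a reference clone set."""
        if not detected or not reference:
            return 0.0
        tp = len(detected & reference)
        if tp == 0:
            return 0.0
        precision = tp / len(detected)
        recall = tp / len(reference)
        return 2 * precision * recall / (precision + recall)

    def tune_min_tokens(run_detector: Callable[[int], Set[ClonePair]],
                        reference: Set[ClonePair],
                        candidates: Iterable[int]) -> Tuple[int, float]:
        """Run the detector once per candidate b and return the best (b, F-score)."""
        best_b, best_f = -1, -1.0
        for b in candidates:
            f = f_score(run_detector(b), reference)
            if f > best_f:
                best_b, best_f = b, f
        return best_b, best_f

With candidates such as range(15, 55, 5), a sweep of this shape would surface the kind of gap the statement reports between b=15 and the default b=50; the point is simply that the tuned value, not the default, should be used when benchmarking or comparing tools.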
“…For example, using the default settings for ccfx (b=50, t=12) leads to a very low F-score of 0.5781 due to a very high number of false negatives. Interestingly, a previous study on agreement of clone detectors (Wang et al. 2013) observed the same difference between default and optimal configurations.…”
Section: Pervasively Modified Code (mentioning)
confidence: 55%
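
For reference, the F-score cited above is the standard harmonic mean of precision and recall over true positives (TP), false positives (FP), and false negatives (FN):

    \[ P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}, \qquad F = \frac{2PR}{P + R} \]

A very high FN count depresses recall R and therefore F, which is the mechanism the statement attributes to ccfx's default configuration; the specific value 0.5781 is reported by the citing study, not derived here.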
“…These adjustments can be thought of as a configuration choice problem [7], and we will show a brief result of these adjustments in the next subsection.…”
Section: Results (mentioning)
confidence: 99%