Detecting Clones Across Microsoft .NET Programming Languages

Al-Omari, Farouq; Keivanloo, Iman; Roy, Chanchal K.; Rilling, Juergen

doi:10.1109/wcre.2012.50

Cited by 31 publications

(25 citation statements)

References 18 publications

“…They implemented a tool called C2D2 based on the CodeDOM library in the Microsoft .NET framework, which uses NRefactory Library to generate the Unified CodeDOM graph for both C# and VB.NET. Al-omari et al [10] present a clone detection approach for the .NET language family too, based on the Common Intermediate Language (CIL). It can detect cross-language clone pairs in C#, J#, and VB.NET.…”

Section: Related Work

mentioning

confidence: 99%

CLCMiner: Detecting Cross-Language Clones without Intermediates

Cheng, Peng, Jiang, et al. 2017

IEICE Trans. Inf. & Syst.

SUMMARY: The proliferation of diverse kinds of programming languages and platforms makes it a common need to have the same functionality implemented in different languages for different platforms, such as Java for Android applications and C# for Windows phone applications. Although versions of code written in different languages appear syntactically quite different from each other, they are intended to implement the same software and typically contain many code snippets that implement similar functionalities, which we call cross-language clones. When the version of code in one language evolves according to changing functionality requirements and/or bug fixes, its cross-language clones may also need to be changed to maintain consistent implementations for the same functionality. Thus, automated ways are needed to locate and track cross-language clones within the evolving software. In the literature, approaches for detecting cross-language clones are only for languages that share a common intermediate language (such as the .NET language family) because they are built on techniques for detecting single-language clones. To extend the capability of cross-language clone detection to more diverse kinds of languages, we propose a novel automated approach, CLCMiner, without the need of an intermediate language. It mines such clones from revision histories, based on our assumption that revisions to different versions of code implemented in different languages may naturally reflect how programmers change cross-language clones in practice, and that similarities among the revisions (referred to as clones in diffs or diff clones) may indicate actual similar code. We have implemented a prototype and applied it to ten open source projects with implementations in both Java and C#. The reported clones that occur in revision histories have high precision (89% on average) and recall (95% on average). Compared with token-based code clone detection tools that can treat code as plain text, our tool can detect significantly more cross-language clones. All the evaluation results demonstrate the feasibility of revision-history based techniques for detecting cross-language clones without intermediates and point to promising future work.

Section: Related Work

mentioning

confidence: 99%

“…A number of researchers [3], [10] have started to detect cross-language code clones too. However, their approaches are limited to detect clones in the .NET language family that share a common intermediate language.…”

mentioning

confidence: 99%

CLCMiner: Detecting Cross-Language Clones without Intermediates

Cheng, Peng, Jiang, et al. 2017

IEICE Trans. Inf. & Syst.

“…The core of the algorithm uses a hash function to generate simhash values. Among various non-cryptographic hash functions we use Jenkin hash function since it shows better similarity preserving behaviour compared to other functions and also found effective in detecting nearmiss code fragments in other studies [1], [29], [30]. We generate a 64 bit simhash value for both context and content using the simhash algorithm [25].…”

Section: Generate Candidate List

mentioning

confidence: 99%

LHDiff: A Language-Independent Hybrid Approach for Tracking Source Code Lines

Asaduzzaman, Roy, Schneider, et al. 2013

2013 IEEE International Conference on Software Maintenance

Self Cite

Tracking source code lines between two different versions of a file is a fundamental step for solving a number of important problems in software maintenance such as locating bug introducing changes, tracking code fragments or defects across versions, merging file versions, and software evolution analysis. Although a number of such approaches are available in the literature, their performance is sensitive to the kind and degree of source code changes. There is also a marked lack of study on the effect of change types on source location tracking techniques. In this paper, we propose a language-independent technique, LHDiff, for tracking source code lines across versions that leverages simhash technique together with heuristics to improve accuracy. We evaluate our approach against state-of-the-art techniques using benchmarks containing different degrees of changes where files are selected from real world applications. We further evaluate LHDiff with other techniques using a mutation based analysis to understand how different types of changes affect their performance. The results reveal that our technique is more effective than language-independent approaches and no worse than some language-dependent techniques. In our study LHDiff even shows better performance than a state-of-the-art language-dependent approach. In addition, we also discuss limitations of different line tracking techniques including ours and propose future research directions.
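The simhash step described in the LHDiff citation statement above (a 64-bit simhash built on a Jenkins hash) can be illustrated with a minimal sketch. This is not LHDiff's implementation: the function names are hypothetical, the Jenkins one-at-a-time variant is an assumed choice, and combining two seeded 32-bit hashes to obtain 64 bits is an assumption made only to keep the example short.

```python
from typing import Iterable

MASK32 = 0xFFFFFFFF


def jenkins_one_at_a_time(data: bytes, seed: int = 0) -> int:
    """Bob Jenkins' one-at-a-time hash (32-bit). The seed parameter is an
    illustrative addition; seed=0 gives the standard function."""
    h = seed & MASK32
    for b in data:
        h = (h + b) & MASK32
        h = (h + (h << 10)) & MASK32
        h ^= h >> 6
    h = (h + (h << 3)) & MASK32
    h ^= h >> 11
    h = (h + (h << 15)) & MASK32
    return h


def hash64(token: str) -> int:
    """Assumption: build a 64-bit value from two differently seeded
    32-bit hashes. The cited work may derive its 64 bits differently."""
    data = token.encode("utf-8")
    hi = jenkins_one_at_a_time(data, seed=0)
    lo = jenkins_one_at_a_time(data, seed=0x9E3779B9)
    return (hi << 32) | lo


def simhash64(tokens: Iterable[str]) -> int:
    """Simhash: each token votes +1/-1 on every bit position;
    the fingerprint keeps the bits with a positive total."""
    counts = [0] * 64
    for tok in tokens:
        h = hash64(tok)
        for i in range(64):
            counts[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, c in enumerate(counts):
        if c > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits; small distances suggest similar inputs."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    old_line = "int total = count + offset ;".split()
    new_line = "int total = count + delta ;".split()
    print(hamming_distance(simhash64(old_line), simhash64(new_line)))
```

In a line-tracking setting, one fingerprint could be computed for a line's content and another for its surrounding context, with candidate matches ranked by Hamming distance; the threshold and the tokenization are left open here because the quoted text does not specify them.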

“…Existing single-language clone detection techniques are unable to effectively detect these sorts of cross-language clones. In this paper we propose a method to detect cross-language clones and demonstrate that it (1) finds cross-language clones that no existing method can detect; and (2) performs comparably to existing single-language clone detectors for finding clones within a corpus of single-language code sources. Therefore, our technique generalizes [Figure caption: A JavaScript (left) and Java (right) clone pair setting the weight and inverse weight of a particle in a graphics application.]…”

Section: Introduction

mentioning

confidence: 99%

“…2). That initial work has either focused on clones across languages that share a common intermediate representation such as .NET [1,15] or has deviated from classical clone detection and taken a more restricted, natural language-based approach, sometimes relying on assumptions that may not be met in real code [7,8]. None of that existing work would detect the clone examples given in Figs.…”

Section: Introduction

mentioning

confidence: 99%

Structural and Nominal Cross-Language Clone Detection

Nichols, Emre, Hardekopf 2019

Fundamental Approaches to Software Engineering

In this paper we address the challenge of cross-language clone detection. Due to the rise of cross-language libraries and applications (e.g., apps written for both Android and iPhone), it has become common for code fragments in one language to be ported over into another language in an extension of the usual "copy and paste" coding methodology. As with single-language clones, it is important to be able to detect these cross-language clones. However, there are many real-world cross-language clones that existing techniques cannot detect. We describe the first general, cross-language algorithm that combines both structural and nominal similarity to find syntactic clones, thereby enabling more complete clone detection than any existing technique. This algorithm also performs comparably to the state of the art in single-language clone detection when applied to single-language source code; thus it generalizes the state of the art in clone detection to detect both single- and cross-language clones using one technique.
