2018
DOI: 10.1002/cpe.5000

Software plagiarism detection in multiprogramming languages using machine learning approach

Abstract: Software plagiarism, which gives rise to the problem of software piracy, is a growing major concern. It is a serious risk to the software industry and causes huge economic damage every year. Customers may develop a modified version of the original software in other programming languages. Furthermore, plagiarism detection across different types of source code is a challenging task because each may have specific syntax rules. In this paper, we proposed a methodology for so…

Cited by 21 publications (17 citation statements)
References 27 publications
“…Arguing that these metrics were not representative, the metrics was then extended by considering more factors [3,16] in which some of them are contextual (like source code tokens) [4,23]. The similarity measurement was also updated by adopting algorithms from other domains such as information retrieval [12,18,26], clustering [1,24,36], and classification [50,56]. Occasionally, algorithms from some of these domains were applied altogether [14,46].…”
Section: Related Work (mentioning)
confidence: 99%
“…Classification algorithms is also applicable for language-independent techniques in which one instance refers to a source code pair and its target class defines whether the pair is suspicious. Ullah et al [50] consider source code words as learning features and, therefore, raise the suspicion for some pairs with the help of multinomial logistic regression [6]. Yasaswi, Purini, and Jawahar [56] classify the suspicion through a character-level language model and a support-vector machine [10].…”
Section: Related Work (mentioning)
confidence: 99%
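
The pair-classification setup described in the statement above can be sketched as follows. This is a minimal illustration, not the method of Ullah et al.: it assumes a single word-overlap feature (cosine similarity of token counts) and a two-class logistic regression trained by plain gradient steps, standing in for the multinomial logistic regression the statement mentions. The function names, the toy pairs, and the learning-rate settings are all hypothetical.

```python
import math
import re
from collections import Counter


def tokens(source):
    """Split source code into lowercase word tokens (identifiers and keywords)."""
    return re.findall(r"[A-Za-z_]\w*", source.lower())


def pair_features(code_a, code_b):
    """Describe one source code pair as [bias, cosine similarity of word counts]."""
    a, b = Counter(tokens(code_a)), Counter(tokens(code_b))
    dot = sum(count * b[word] for word, count in a.items())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(sum(c * c for c in b.values()))
    return [1.0, dot / norm if norm else 0.0]


def suspicion(weights, features):
    """Logistic model: estimated probability that the pair is suspicious."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, labels, rate=0.5, epochs=2000):
    """Fit the weights by stochastic gradient ascent on the log-likelihood."""
    weights = [0.0] * len(examples[0])
    for _ in range(epochs):
        for x, y in zip(examples, labels):
            error = y - suspicion(weights, x)
            weights = [w + rate * error * xi for w, xi in zip(weights, x)]
    return weights


# Hypothetical toy training set: (code_a, code_b, label), where 1 = suspicious pair.
PAIRS = [
    ("int total = 0; for (int i = 0; i < n; i++) total += values[i];",
     "int total = 0; for (int i = 0; i < n; i++) total += values[i]; return total;", 1),
    ("def area(w, h): return w * h",
     "def area(width, h): return width * h", 1),
    ("print('hello world')",
     "while queue: node = queue.pop()", 0),
    ("x = read_input()",
     "for item in items: total += item", 0),
]

WEIGHTS = train([pair_features(a, b) for a, b, _ in PAIRS],
                [label for _, _, label in PAIRS])
```

Calling `suspicion(WEIGHTS, pair_features(code_a, code_b))` on a new pair returns a score between 0 and 1. Because the features are plain word tokens, the sketch stays language-independent in the sense used above, but it also ignores renamed identifiers and reordered statements.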
“…Furthermore, the plagiarism detection in different types of source codes is a challenging task because each source code may have specific syntax rules. In the contribution by Ullah et al, “Software Plagiarism Detection in Multiprogramming Languages using Machine Learning Approach,” the authors proposed a methodology for software plagiarism detection in multiprogramming languages based on machine learning approaches [8]. The Principal Component Analysis (PCA) is applied for features extraction from source codes without losing the actual information.…”
Section: Themes Of This Special Issue (mentioning)
confidence: 99%
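
To make the PCA step mentioned in the statement above concrete, here is a minimal sketch that finds the first principal component of a small feature matrix by power iteration. It is an illustration under stated assumptions, not the pipeline of the cited paper: the rows are hypothetical token-count vectors, only one component is extracted, and the function name and iteration count are invented for the example.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, scores) for the first principal component of rows."""
    n, d = len(rows), len(rows[0])
    # Center every feature column on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the centered features.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration: repeatedly multiply by cov and renormalize.
    # Assumes the start vector is not orthogonal to the leading eigenvector.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project each centered row onto the component to get its 1-D score.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Hypothetical token counts per file: the second feature is always twice the first.
COUNTS = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
DIRECTION, SCORES = first_principal_component(COUNTS)
```

In this toy matrix the two features are perfectly correlated, so a single component captures all of the variance. With real token counts each discarded component removes some variance, so how much information is retained depends on how many components are kept.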
“…The software piracy is the development of software by reusing source codes illegally from someone else's work and disguise as the original version. The cracker may copy the logic of the original software by reverse engineering procedures and then design the same logic in another type of source codes [6]. It is a severe threat to internet security, which gives access to unlimited downloads of pirated software, open-source codes and, promotes and advertises of pirated versions.…”
Section: Introduction (mentioning)
confidence: 99%