2017
DOI: 10.1155/2017/7809047

WASTK: A Weighted Abstract Syntax Tree Kernel Method for Source Code Plagiarism Detection

Abstract: In this paper, we introduce a source code plagiarism detection method, named WASTK (Weighted Abstract Syntax Tree Kernel), for computer science education. Different from other plagiarism detection methods, WASTK takes some aspects other than the similarity between programs into account. WASTK firstly transfers the source code of a program to an abstract syntax tree and then gets the similarity by calculating the tree kernel of two abstract syntax trees. To avoid misjudgment caused by trivial code snippets or f…
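
The abstract describes converting each program to an abstract syntax tree and scoring a pair of programs with a tree kernel. The sketch below illustrates that general idea only: it is a simplified, unweighted subtree kernel built on Python's standard `ast` module, not the WASTK method itself, and it omits the weighting scheme that the truncated abstract begins to describe. The function names `subtree_signatures`, `subtree_kernel`, and `normalized_similarity` are hypothetical and chosen for illustration.

```python
import ast
import math
from collections import Counter


def subtree_signatures(source: str) -> Counter:
    """Count every subtree of the AST, keyed by a structural signature."""
    counts: Counter = Counter()

    def visit(node: ast.AST) -> str:
        # Signature = node type plus the signatures of its children, in order.
        # Identifier names and literal values are not part of the signature.
        children = [visit(child) for child in ast.iter_child_nodes(node)]
        signature = f"{type(node).__name__}({','.join(children)})"
        counts[signature] += 1
        return signature

    visit(ast.parse(source))
    return counts


def subtree_kernel(a: Counter, b: Counter) -> int:
    """Number of pairs of structurally identical subtrees across two trees."""
    return sum(count * b[signature] for signature, count in a.items())


def normalized_similarity(src_a: str, src_b: str) -> float:
    """Kernel value scaled to [0, 1] by the self-similarity of each tree."""
    a, b = subtree_signatures(src_a), subtree_signatures(src_b)
    return subtree_kernel(a, b) / math.sqrt(
        subtree_kernel(a, a) * subtree_kernel(b, b)
    )
```

Because the signature ignores identifier names and literal values, two programs that differ only by renaming score as identical under this sketch, while structurally different programs score lower. That is an assumption of this illustration, not a claim about how WASTK treats such cases.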

Cited by 35 publications (29 citation statements)
References: 13 publications
“…These techniques rely on various structures for comparison. Some of the structures are source code token strings [8,29,41], abstract syntax trees [20,30,53], parse trees [48], program dependency graphs [32], and low-level token strings [25,42].…”
Section: Related Work (mentioning)
confidence: 99%
“…Source code token sequence [19][20][21] is a sequence of meaningful "words" from source code; it is usually extracted with the help of a programming language parser. Abstract syntax tree [22] is a tree where the nodes are formed from tokens and the edges are formed from programming language grammar. Program dependency graph [23] is a graph connecting several instructions based on their execution dependencies.…”
Section: Related Work (mentioning)
confidence: 99%
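The token-sequence structure described in the statement above can be illustrated with Python's standard `tokenize` module. This is a minimal sketch under stated assumptions: the helper name `token_sequence` is hypothetical, and replacing identifiers and literals with their token type is one common normalization, not necessarily the one used by the cited tools.

```python
import io
import keyword
import token
import tokenize


def token_sequence(source: str) -> list[str]:
    """Return a normalized token sequence for a piece of Python source."""
    # Layout and comment tokens carry no structural information here.
    skipped = {token.NEWLINE, token.NL, token.INDENT, token.DEDENT,
               token.COMMENT, token.ENDMARKER}
    sequence = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in skipped:
            continue
        if tok.type == token.NAME and not keyword.iskeyword(tok.string):
            sequence.append("NAME")      # identifier: keep only its kind
        elif tok.type in (token.NUMBER, token.STRING):
            sequence.append(token.tok_name[tok.type])  # literal: keep only its kind
        else:
            sequence.append(tok.string)  # keyword or operator: keep the text
    return sequence
```

With this normalization, `total = price + 1` and `x = y + 2` produce the same sequence, so a string-matching step over tokens would treat them as equal.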
“…Structure-based tool defines source code similarity based on given codes' shared structure. The structure can be either source code token sequence [2], [15], [16], low-level token sequence [17], [18], program dependency graph [19] or abstract syntax tree [20]. The similarity of the first two structures are commonly measured by string matching algorithms (e.g., Running-Karp-Rabin Greedy-String-Tiling [5]), that have been modified to handle source code tokens instead of characters.…”
Section: Literature Review (mentioning)
confidence: 99%