Source code plagiarism detection using Running‐Karp‐Rabin Greedy‐String‐Tiling (RKRGST) is a common practice in academic environment. However, such approach is time‐inefficient (due to RKRGST's cubic time complexity) and insensitive (toward token subsequence rearrangement). This paper proposes ES‐Plag, a plagiarism detection tool featured with cosine‐based filtering and penalty mechanism to handle aforementioned issues. Cosine‐based filtering mitigates time‐inefficiency by excluding non‐potential pairs from RKRGST comparison; while penalty mechanism mitigates insensitivity by reducing the number of matched tokens with the number of matched subsequences prior similarity normalization. In addition to issue‐solving features, ES‐Plag is also featured with project‐based input, colorized adjacency similarity matrix, matched token highlighting, and various similarity algorithms (e.g., Cosine Similarity and Local Alignment). Three findings can be deducted from our evaluation. First, cosine‐based filtering boosts up time efficiency with a trade‐off in effectiveness. Second, penalty mechanism enhances sensitivity even though its improvement in terms of effectiveness is quite limited. Third, ES‐Plag's features are beneficial for examiners.
Low-level approach is a novel way to detect source code plagiarism. Such approach is proven to be effective when compared to baseline approach (i.e., an approach which relies on source code token subsequence matching) in controlled environment. We evaluate the effectiveness of state of the art in low-level approach based on Faidhi & Robinson's plagiarism level taxonomy; real plagiarism cases are employed as dataset in this work. Our evaluation shows that state of the art in low-level approach is effective to handle most plagiarism attacks. Further, it also outperforms its predecessor and baseline approach in most plagiarism levels.
-Based on the informal survey, learning algorithm time complexity in a theoretical manner can be rather difficult to understand. Therefore, this research proposed Complexitor, an educational tool for learning algorithm time complexity in a practical manner. Students could learn how to determine algorithm time complexity through the actual execution of algorithm implementation. They were only required to provide algorithm implementation (i.e. source code written on a particular programming language) and test cases to learn time complexity. After input was given, Complexitor generated execution sequence based on test cases and determine its time complexity through Pearson correlation. An algorithm time complexity with the highest correlation value toward execution sequence was assigned as its result. Based on the evaluation, it can be concluded this mechanism is quite effective for determining time complexity as long as the distribution of given input set is balanced.
Even though there are various source code plagiarism detection approaches, only a few works which are focused on low-level representation for deducting similarity. Most of them are only focused on lexical token sequence extracted from source code. In our point of view, low-level representation is more beneficial than lexical token since its form is more compact than the source code itself. It only considers semantic-preserving instructions and ignores many source code delimiter tokens. This paper proposes a source code plagiarism detection which rely on low-level representation. For a case study, we focus our work on .NET programming languages with Common Intermediate Language as its low-level representation. In addition, we also incorporate Adaptive Local Alignment for detecting similarity. According to Lim et al, this algorithm outperforms code similarity state-of-the-art algorithm (i.e. Greedy String Tiling) in term of effectiveness. According to our evaluation which involves various plagiarism attacks, our approach is more effective and efficient when compared with standard lexical-token approach.
Due to its high failure rate, Introductory Programming has become a main concern. One of the main issues is the incapability of slow-paced students to cope up with given programming materials. This paper proposes a learning technique which utilizes pair programming to help slow-paced students on Introductory Programming; each slow-paced student is paired with a fast-paced student and the latter is encouraged to teach the former as a part of grading system. An evaluation regarding that technique has been conducted on three undergraduate classes from an Indonesian university for the second semester of 2018. According to the evaluation, the use of pair programming may help both slow-paced and fast-paced students. Nevertheless, it may not significantly affect individual academic performance.
Despite the fact that plagiarizing source code is a trivial task for most CS students, detecting such unethical behavior requires a considerable amount of effort. Thus, several plagiarism detection systems were developed to handle such issue. This paper extends Karnalim's work, a low-level approach for detecting Java source code plagiarism, by incorporating abstract method linearization. Such extension is incorporated to enhance the accuracy of low-level approach in term of detecting plagiarism in object-oriented environment. According to our evaluation, which was conducted based on 23 design-pattern source code pairs, our extended low-level approach is more effective than state-of-the-art and Karnalim's approach. On the one hand, when compared to state-of-the-art approach, our approach can generate less coincidental similarities and provide more accurate result. On the other hand, when compared to Karnalim's approach, our approach, at some extent, can generate higher similarity when simple abstract method invocation is incorporated.
Keywords-abstract method linearization; low-level language; source code plagiarism detection; object-oriented environmentI.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.