2019
DOI: 10.1007/s10579-019-09444-w

On the use of character n-grams as the only intrinsic evidence of plagiarism

Abstract: When a shift in writing style is noticed in a document, doubts arise about its originality. Based on this clue to plagiarism, the intrinsic approach to plagiarism detection identifies the stolen passages by analysing the writing style of the suspicious document without comparing it to textual resources that may serve as sources for the plagiarist. Character n-grams are recognised as a successful approach to modelling text for writing style analysis. Although prior studies have investigated the best practice of…

Citations: cited by 7 publications (4 citation statements)
References: 30 publications
“…So, the n-gram profile of a given text is defined as the set containing all the n-grams of a predetermined length, n, beside their frequency. The work in [13] summarizes the approaches for detecting intrinsic plagiarism where character n-grams are used.…”
Section: Related Work (mentioning)
confidence: 99%
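
The n-gram profile described in the statement above can be illustrated with a minimal Python sketch. It assumes overlapping character n-grams taken over the raw text with no normalisation; the function name `char_ngram_profile` is hypothetical and is not taken from the cited papers.

```python
from collections import Counter


def char_ngram_profile(text: str, n: int) -> Counter:
    """Map each character n-gram of length n in `text` to its frequency.

    Assumption: n-grams overlap (window slides by one character) and the
    text is used as-is, without lowercasing or whitespace handling.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    # A text shorter than n yields an empty range, hence an empty profile.
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


profile = char_ngram_profile("banana", 2)
print(dict(profile))
```

For the string "banana" and n = 2, the profile holds three distinct bigrams: "ba" once, and "an" and "na" twice each.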
“…The set of all the n-grams of a predefined length, n, extracted along with their frequencies from a given text, is referred to as the text's n-gram profile. Methods for intrinsic plagiarism detection; wherein character n-grams are used, were summarized in [13].…”
Section: Related Work (mentioning)
confidence: 99%
“…In this section we provide a brief overview of various approaches proposed for the detection of plagiarism and paraphrase plagiarism. In particular, approaches based on character and word n-gram similarity (Bensalem et al 2019; Sánchez-Vega et al 2017), vector space models (Sanchez-Perez et al 2014), natural language processing (Chong 2013; Kanjirangat and Gupta 2018) machine translation similarity metrics (Madnani et al 2012) and alignment algorithms (Nichols et al 2019) have been successfully applied towards plagiarism detection. Despite these advances, plagiarism detection when text has been paraphrased remains a challenge due to limited success in measuring semantic overlap (Carmona et al 2018).…”
Section: Plagiarism Detection (mentioning)
confidence: 99%