2018
DOI: 10.22266/ijies2018.0630.28

A Novel Document Representation Approach for Authorship Attribution

Abstract: The rapid growth of data on the web results in stolen, unidentified, and fraudulent data. Identifying such data is a prime objective for forensic departments, researchers, and governments.

Cited by 5 publications (3 citation statements)
References 10 publications
“…Combining ML and NLP techniques, such as multinomial naïve Bayes (MNB), support vector machine (SVM), expectation-maximization algorithm (EM), and stop words, lemmatization, and stemming [24], are used to identify fake reviews. Mekala et al [25] demonstrate high precision by utilizing the approach of stylistic characteristics and term weight measurement. Saha et al [26] achieved 96 percent accuracy using MLP on a dataset of social text from social media platforms.…”
Section: Literature Review (mentioning)
confidence: 99%
“…The formal study of authorship analysis started in the 19th century, it was first tackled with linguistic approaches and eventually by statistical and computational methods [3]. These tasks continue to grow attention for their practical applications; for example, in a variety of computer crime investigations ranging from homicide to identity theft and many types of financial crimes [4] or in the context of identifying the author of source code [5].…”
Section: Related Work (mentioning)
confidence: 99%
“…Recently, the latter has been combined with other stylometric features. For example, Sapkota et al (2014) used 13 stylometric features: number of sentences, number of tokens per sentence, number of punctuation marks per sentence, and so forth; Mekala et al (2018) extracted 39 stylometric features such as character count, block-letter words, and average sentence length in terms of characters/words; Wu et al (2021) combined four features of statistical style (i.e., average word/sentence length, letter frequency, numbers of 26 letters, and punctuation marks), three content features, two syntactic features, and one semantic feature to predict the author.…”
Section: Sentence Length (mentioning)
confidence: 99%
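
The last citation statement names a few of the stylometric features attributed to Mekala et al. (2018): character count, block-letter words, and average sentence length in characters and words. A minimal Python sketch of how such features might be computed follows. It is not the paper's implementation: the function name `stylometric_features`, the naive sentence splitter, and the reading of "block-letter words" as all-uppercase words of two or more letters are all assumptions made for illustration.

```python
import re


def stylometric_features(text: str) -> dict:
    """Compute a few illustrative stylometric features for one document.

    Hypothetical helper: the feature definitions below are assumptions,
    not the definitions used in the cited paper.
    """
    # Naive sentence split: break on whitespace that follows ., ! or ?
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Words are runs of ASCII letters and apostrophes (an assumption).
    words = re.findall(r"[A-Za-z']+", text)
    # "Block-letter words" is read here as all-uppercase words of 2+ characters.
    block_words = [w for w in words if len(w) > 1 and w.isupper()]
    # Avoid division by zero on empty input.
    n_sent = max(len(sentences), 1)
    return {
        "char_count": len(text),
        "block_letter_words": len(block_words),
        "avg_sentence_len_chars": sum(len(s) for s in sentences) / n_sent,
        "avg_sentence_len_words": len(words) / n_sent,
    }


features = stylometric_features("This is SHORT. It has TWO sentences!")
print(features)
```

In an authorship attribution setting, a vector of such values per document would typically be fed to a classifier; which classifier and which of the 39 features the paper uses is not stated in the text above.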