2019
DOI: 10.5539/ijel.v9n5p182

Towards a Linguistic Stylometric Model for the Authorship Detection in Cybercrime Investigations

Abstract: This study proposes an integrated framework that considers letter-pair frequencies/combinations along with the lexical features of documents as a means of identifying the authorship of short texts posted anonymously on social media. Taking a quantitative morpho-lexical approach, this study tests the hypothesis that letter information, or mapping, can identify unique stylistic features. As such, stable word combinations and morphological patterns can be used successfully for authorship detection in relation to …
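To make the idea of combining letter-pair frequencies with lexical features concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the study's method: the function names, the `lp:`/`w:` feature prefixes, the equal weighting of the two feature families, and the use of cosine similarity to compare texts are all hypothetical choices that the abstract does not specify.

```python
from collections import Counter
import math
import re


def _words(text):
    # Assumption: lowercase ASCII letters only; the study's tokenization is not given.
    return re.findall(r"[a-z]+", text.lower())


def letter_pair_freqs(text):
    """Relative frequencies of adjacent letter pairs inside words."""
    pairs = Counter(w[i:i + 2] for w in _words(text) for i in range(len(w) - 1))
    total = sum(pairs.values())
    return {p: c / total for p, c in pairs.items()} if total else {}


def word_freqs(text):
    """Relative word frequencies, used here as a simple stand-in for lexical features."""
    words = _words(text)
    counts = Counter(words)
    return {w: c / len(words) for w, c in counts.items()} if words else {}


def features(text):
    """Merge both feature families into one sparse vector (hypothetical design)."""
    feats = {"lp:" + k: v for k, v in letter_pair_freqs(text).items()}
    feats.update({"w:" + k: v for k, v in word_freqs(text).items()})
    return feats


def cosine(a, b):
    """Cosine similarity between two sparse feature dictionaries."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```

A questioned short text could then be compared against each candidate author's known writing with `cosine(features(questioned), features(known))`, with higher scores suggesting closer stylistic similarity. On very short texts these vectors are sparse, so any such score should be read with caution.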

Cited by 4 publications (4 citation statements)
References 19 publications
“…The approach is widely used due to its conceptual simplicity and ease of determining semantic similarity within documents (Zhiguo, Luo, Chen, Wang & Lei, 2011). One problem with VSC, however, is that it cannot deal with short documents effectively due to sparsity (Amensisa, Patil & Agrawal, 2018; Moisl & Maguire, 2008; Omar & Aldawsari, 2019). Given the nature of the lyrics in this study, the GSDMM technique developed by Yin and Wang (Yin & Wang, 2014) was selected.…”
Section: Methods
Citation type: mentioning
confidence: 99%
“…For example, Ishihara (2017) demonstrated how forensic text comparison could be used on chat conversations of various lengths from 500 to 2500 tokens. Omar and Deraan (2019) found that the inclusion of different variables into an integrated system leads to improved Authorship Attribution performance on short texts. A combination of analysing lexical features and letter-pair frequencies resulted in an accuracy of 76%.…”
Section: The Linguistics Of Grooming
Citation type: mentioning
confidence: 99%
“…In the case of a well-constructed anonymous profile, additional specialized expertise may also be needed in order to link it to a specific person. A good example is the analysis of the user's language-use habits, which can also provide information about their social and community affiliation, origin, and education (Omar, 2019).…”
Section: Open-Source Data Collection: The Solution?
Citation type: unclassified