Proceedings of the 2017 Federated Conference on Computer Science and Information Systems 2017
DOI: 10.15439/2017f110

A Comparison of Authorship Attribution Approaches Applied on the Lithuanian Language

Abstract: This paper reports comparative authorship attribution results obtained on Internet comments written in the morphologically complex Lithuanian language. We explored the impact of machine learning and similarity-based approaches across different author set sizes (10, 100, and 1,000 candidate authors), feature types (lexical, morphological, and character), and feature selection techniques (feature ranking, random selection). The authorship attribution task was complicated due to the used Lith…

Cited by 5 publications (3 citation statements)
References: 19 publications
“…With the recent increase in demand for various Natural Language Processing (NLP) technologies, such as chatbots [3], content classification [4], Sentiment Analysis [5][6][7], hate speech detection [8,9], authorship recognition and attribution [10], product and service recommenders [11,12], text summarization [13,14], email spam detection [15] and phishing detection [16], intent detection [17], and search optimization [18], ML models have presented a huge advantage and have created many opportunities for researchers in the field of text classification.…”
Section: Introduction (mentioning)
confidence: 99%
“…TC is a machine learning challenge that tries to classify new written content into a conceptual group from a predetermined classification collection [1]. It is crucial in a variety of applications, including sentiment analysis [2,3], spam email filtering [4,5], hate speech detection [6], text summarization [7], website classification [8], authorship attribution [9], information retrieval [10], medical diagnostics [11], emotion detection on smart phones [12], online recommendations [13], fake news detection [14,15], crypto-ransomware early detection [16], semantic similarity detection [17], part-of-speech tagging [18], news classification [19], and tweet classification [20].…”
Section: Introduction (mentioning)
confidence: 99%
“…In the whole area of authorship identification, authorship attribution is the most explored topic for the morphologically complex Lithuanian language (the recent research work is described in [15], [16]). Unfortunately, the deep learning methods have never been applied on the Lithuanian language in any of these tasks, including AP.…”
Section: Introduction and Related Work (mentioning)
confidence: 99%