2015
DOI: 10.1007/978-3-319-24770-0_37

Authorship Attribution of Internet Comments with Thousand Candidate Authors

Cited by 3 publications (4 citation statements)
References 31 publications
“…They noted that a set of rich linguistic features performed better for author prediction than word frequencies and character trigrams. Other researchers [7] obtained the best results when a combination of word-based and character tetragram features was used. In [8], the researchers extracted POS bigrams and trigrams, character trigrams, the percentage of direct speech, and syntactic features from the documents.…”
Section: Literature Survey (citation type: mentioning)
confidence: 98%
“…Although for the Lithuanian language there exist 1) many descriptive research works (e.g., [14], [15]) and 2) some experiments with machine learning (carried out on parliamentary transcripts or forum posts of only 100 candidate authors) [16] or similarity-based approaches (using very limited training data) [17], these findings do not guarantee the best results for our AA task. Our aim is to perform a comparative investigation and to find the best method, feature type, and feature selection technique for our AA task (with 10, 100, and 1,000 candidate authors) on the corpus of Lithuanian Internet comments.…”
Section: Related Work (citation type: mentioning)
confidence: 99%
“…The SB-RFS technique is adjusted to cope with very concise texts; it performs especially well on a small number of features, because the final attribution decision incorporates the generalized results of several decisions obtained over a few iterations. In our experiments we used the SB-TopN and SB-RFS implementations presented in [17].…”
Section: Proceedings of the FedCSIS Prague 2017 (citation type: mentioning)
confidence: 99%