Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing 2017
DOI: 10.18653/v1/w17-1411

Comparison of Short-Text Sentiment Analysis Methods for Croatian

Abstract: We focus on the task of supervised sentiment classification of short and informal texts in Croatian, using two simple yet effective methods: word embeddings and string kernels. We investigate whether word embeddings offer any advantage over corpus- and preprocessing-free string kernels, and how these compare to bag-of-words baselines. We conduct a comparison on three different datasets, using different preprocessing methods and kernel functions. Results show that, on two out of three datasets, word embeddings ou…

Cited by 12 publications (9 citation statements)
References 28 publications (21 reference statements)
“…Of course, due to same previously mentioned reasons about the Lithuanian language specifics (in Section 2.2), we cannot expect the accuracy for the Lithuanian language above 80% (as it is reported in most sentiment analysis research works for the English language). For instance, the sentiment classification on the morphologically complex Croatian language with SVM method reaches ~57% of the f-score on the general topic and ~72% on the specific domain [59]. The authors in [60] solved the sentiment analysis task for Czech on the Facebook posts with Maximum Entropy and SVM and the best achieved f-score is 69%.…”
Section: Discussion (mentioning)
confidence: 99%
“…The short-text sentiment analysis has attracted wide attention [15], [16]. The sparsity issue of short text data makes it difficult to learn good word representation for researches in this field.…”
Section: Related Work (mentioning)
confidence: 99%
“…Using a combination of unigrams and bigrams, they achieved a statistically significant improvement compared to the baseline method for two classes (accuracy 86.11%) and for three classes (accuracy 63.02%). A comparative overview of the method of analyzing sentiment for the Croatian language is given in [22]. Due to the great similarity of the Croatian and the Serbian language in the complex-morphological sense and in other characteristics, we can consider that the results achieved are similar to those in the Serbian language.…”
Section: Sentiment Analysis In Serbian (mentioning)
confidence: 99%