Comparison of Short-Text Sentiment Analysis Methods for Croatian

Rotim, Leon; Šnajder, Jan

doi:10.18653/v1/w17-1411

Cited by 12 publications

(9 citation statements)

References 28 publications

(21 reference statements)

“…Of course, due to same previously mentioned reasons about the Lithuanian language specifics (in Section 2.2), we cannot expect the accuracy for the Lithuanian language above 80% (as it is reported in most sentiment analysis research works for the English language). For instance, the sentiment classification on the morphologically complex Croatian language with SVM method reaches ~57% of the f-score on the general topic and ~72% on the specific domain [59]. The authors in [60] solved the sentiment analysis task for Czech on the Facebook posts with Maximum Entropy and SVM and the best achieved f-score is 69%.…”

Section: Discussion (mentioning)

confidence: 99%

Sentiment Analysis of Lithuanian Texts Using Traditional and Deep Learning Approaches

2019

We describe the sentiment analysis experiments that were performed on the Lithuanian Internet comment dataset using traditional machine learning (Naïve Bayes Multinomial, NBM, and Support Vector Machine, SVM) and deep learning (Long Short-Term Memory, LSTM, and Convolutional Neural Network, CNN) approaches. The traditional machine learning techniques were used with features based on lexical, morphological, and character information. The deep learning approaches were applied on top of two types of word embeddings (Word2Vec continuous bag-of-words with negative sampling, and FastText). Both traditional and deep learning approaches had to solve the positive/negative/neutral sentiment classification task on the balanced and full dataset versions. The best deep learning results (an accuracy of 0.706) were achieved on the full dataset with CNN applied on top of the FastText embeddings, with emoticons replaced and diacritics eliminated. The traditional machine learning approaches demonstrated the best performance (an accuracy of 0.735) on the full dataset with the NBM method, with emoticons replaced, diacritics restored, and lemma unigrams as features. Although the traditional machine learning approaches were superior to the deep learning methods, deep learning demonstrated good results when applied to small datasets.
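The two preprocessing steps named in this abstract, emoticon replacement and diacritic elimination, can be illustrated with a minimal sketch. The emoticon map, the token names `emo_pos` and `emo_neg`, and the function names below are assumptions made for illustration; they are not taken from the cited paper, whose actual replacement lists and tooling are not given here.

```python
import unicodedata

# Hypothetical emoticon-to-token map (an assumption, not the paper's list).
EMOTICON_TOKENS = {":)": "emo_pos", ":-)": "emo_pos", ":(": "emo_neg", ":-(": "emo_neg"}


def strip_diacritics(text: str) -> str:
    """Remove combining marks after canonical decomposition (e.g. 'č' -> 'c')."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def replace_emoticons(text: str) -> str:
    """Replace each known emoticon with a placeholder word token."""
    # Longer emoticons are handled first so that overlapping patterns stay safe.
    for emoticon in sorted(EMOTICON_TOKENS, key=len, reverse=True):
        text = text.replace(emoticon, f" {EMOTICON_TOKENS[emoticon]} ")
    return text


def preprocess(text: str) -> list[str]:
    """Replace emoticons, lowercase, strip diacritics, and split on whitespace."""
    text = replace_emoticons(text)
    text = strip_diacritics(text.lower())
    return text.split()


if __name__ == "__main__":
    print(preprocess("Ačiū :)"))
```

The resulting token list could then feed either a bag-of-words classifier or an embedding lookup; which of the two the abstract's best configurations used is stated in the abstract itself, not in this sketch.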

“…The short-text sentiment analysis has attracted wide attention [15], [16]. The sparsity issue of short text data makes it difficult to learn good word representation for researches in this field.…”

Section: Related Work (mentioning)

confidence: 99%

Bi-Level Attention Model for Sentiment Analysis of Short Texts

Liu

Yin

2019

IEEE Access

Short text is an important form of information dissemination and opinion expression on various social media platforms. Sentiment analysis of short texts is beneficial for understanding customers' emotional states and for obtaining customers' opinions and attitudes toward events, information, and products; however, it is difficult because of the sparsity of short-text data. Unlike traditional methods that use external knowledge, this paper proposes a bi-level attention model (BAM) for sentiment analysis of short texts, which does not rely on external knowledge to deal with data sparsity. Specifically, at the word level, our model improves word representation by introducing latent topic information into the word-level semantic representation. A neural topic model is used to discover the latent topics of the text, and a new topic-word attention mechanism is presented to explore the semantics of words from the perspective of topic-word association. At the sequence level, a secondary attention mechanism is used to capture the relationship between local and global sentiment expression. Experiments on the ChnSentiCorp-Htl-ba-10000 and NLPCC-ECGC datasets validate the effectiveness of the BAM model.

Index terms: attention mechanism, sentiment analysis, text analysis, topic model, word embedding.

A. Sparsity of short-text data: At present, two main methods address data sparsity in short-text sentiment analysis [1], [17]: introduction of external knowledge to assist sentiment analysis, and dimensionality reduction [1], [5].

“…Using a combination of unigrams and bigrams, they achieved a statistically significant improvement compared to the baseline method for two classes (accuracy 86.11%) and for three classes (accuracy 63.02%). A comparative overview of the method of analyzing sentiment for the Croatian language is given in [22]. Due to the great similarity of the Croatian and the Serbian language in the complex-morphological sense and in other characteristics, we can consider that the results achieved are similar to those in the Serbian language.…”

Section: Sentiment Analysis In Serbian (mentioning)

confidence: 99%

Improving sentiment analysis for twitter data by handling negation rules in the Serbian language

Ljajić

Marovac

2019

ComSIS

The importance of determining sentiment for short text increases with the rise in the number of comments on social networks. The presence of negation in these texts affects their sentiment, because its scope is larger in proportion to the length of the text. In this paper, we examine how the treatment of negation impacts the sentiment of tweets in the Serbian language. The grammatical rules that influence the change of polarity are processed. We performed an analysis of the effect of the negation treatment on the overall process of sentiment analysis. A statistically significant relative improvement was obtained when the negation was processed using our rules: up to 31.16% with the lexicon-based approach and up to 2.65% with machine learning methods. By applying machine learning methods, an accuracy of 68.84% was achieved on a set of positive, negative and neutral tweets, and an accuracy of as much as 91.13% on the set of positive and negative tweets.
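To make the idea of negation scope concrete, here is a minimal sketch of one common, generic technique: prefixing the tokens that follow a negation cue with `NOT_` until a fixed window or a clause boundary is reached. This is not the grammatical rule set from the cited paper. The cue list, the window size `scope`, and the function name `mark_negation` are all assumptions chosen for illustration.

```python
# Illustrative Serbian negation cues only (an assumption, not the paper's rules).
NEGATION_CUES = {"ne", "nije", "nisam", "nema"}
# Punctuation tokens that end the negation scope in this sketch.
CLAUSE_END = {".", ",", "!", "?", ";"}


def mark_negation(tokens: list[str], scope: int = 2) -> list[str]:
    """Prefix up to `scope` tokens after a negation cue with 'NOT_'.

    The scope is cut short when a clause-ending punctuation token appears.
    """
    marked = []
    remaining = 0  # how many upcoming tokens are still inside a negation scope
    for token in tokens:
        lowered = token.lower()
        if lowered in CLAUSE_END:
            remaining = 0
            marked.append(token)
        elif lowered in NEGATION_CUES:
            remaining = scope
            marked.append(token)
        elif remaining > 0:
            marked.append("NOT_" + token)
            remaining -= 1
        else:
            marked.append(token)
    return marked


if __name__ == "__main__":
    print(mark_negation(["film", "nije", "dobar", ",", "ali", "gluma", "je", "odlična"]))
```

In a short tweet, a window of even two tokens covers a large share of the text, which is one way to read the abstract's remark that negation has a proportionally larger range of action in short texts.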
