Developing a successful SemEval task in sentiment analysis of Twitter and other social media texts

Nakov, Preslav; Rosenthal, Sara; Kiritchenko, Svetlana; Mohammad, Saif M.; Kozareva, Zornitsa; Ritter, Alan; Stoyanov, Veselin; Zhu, Xiaodan

doi:10.1007/s10579-015-9328-1

Cited by 87 publications

(65 citation statements)

References 46 publications

“…Second, the systematic removal of controversial (or, high-disagreement) data (Saif et al, 2013; Nakov et al, 2016). We argue that this tendency is problematic because any automatic sentiment analysis system to be implemented in a real-world setting cannot know a priori which tweets will be "noisy" or "controversial".…”

Section: Summary of TSA Problems (mentioning)

confidence: 97%

“…There are a variety of methods for constructing TSA datasets along a variety of domains, ranging from very specific (e.g., OMD (Shamma et al, 2009)) to general (e.g., SemEval 2013-2014 (Nakov et al, 2016)). While there is the popular Stanford Twitter corpus, constructed with noisy labellings (Go et al, 2009), the more common method of constructing TSA datasets relies on manual annotation (usually crowd-sourced) of tweet sentiment to establish gold-standard labellings according to a pre-defined set of possible label categories (often POSITIVE, NEGATIVE, and NEUTRAL) (Shamma et al, 2009; Speriosu et al, 2011; Thelwall et al, 2012; Saif et al, 2013; Nakov et al, 2016; Rosenthal et al, 2017).…”

Section: Current Problems in TSA (mentioning)

confidence: 99%

“…Nonetheless, most work on this dataset filters out tweets with less than two-thirds agreement (Speriosu et al, 2011; Saif et al, 2013) (Table 2). Unfortunately, many later dataset releases have not followed the example of the OMD; the designers of such datasets have opted instead to release only the resultant labelling according to a motivated (but constraining) label-assignment schema, often removing tweets with high inter-annotator disagreement from the final dataset release (Saif et al, 2013; Nakov et al, 2016; Rosenthal et al, 2017).…”

Section: Current Problems in TSA (mentioning)

confidence: 99%

Sentiment Analysis: It’s Complicated!

Kenyon-Dean, Ahmed, Fujimoto, et al. 2018

Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Hu

Sentiment analysis is used as a proxy to measure human emotion, where the objective is to categorize text according to some predefined notion of sentiment. Sentiment analysis datasets are typically constructed with gold-standard sentiment labels, assigned based on the results of manual annotations. When working with such annotations, it is common for dataset constructors to discard "noisy" or "controversial" data where there is significant disagreement on the proper label. In datasets constructed for the purpose of Twitter sentiment analysis (TSA), these controversial examples can compose over 30% of the originally annotated data. We argue that the removal of such data is a problematic trend because, when performing real-time sentiment classification of short-text, an automated system cannot know a priori which samples would fall into this category of disputed sentiment. We therefore propose the notion of a "complicated" class of sentiment to categorize such text, and argue that its inclusion in the short-text sentiment analysis framework will improve the quality of automated sentiment analysis systems as they are implemented in real-world settings. We motivate this argument by building and analyzing a new publicly available TSA dataset of over 7,000 tweets annotated with 5x coverage, named MTSA. Our analysis of classifier performance over our dataset offers insights into sentiment analysis dataset and model design, how current techniques would perform in the real world, and how researchers should handle difficult data.
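As a rough illustration of the idea in this abstract, the sketch below assigns a gold label from several annotations per tweet and routes low-agreement tweets to a separate class instead of discarding them. The function name `aggregate_label`, the label string `COMPLICATED`, and the two-thirds threshold are assumptions made for this example (the threshold is borrowed from the filtering practice quoted in the citation statements above); none of them is taken from the MTSA annotation schema, which this page does not describe.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical label for tweets whose annotators disagree too much.
COMPLICATED = "COMPLICATED"


def aggregate_label(annotations, min_agreement=Fraction(2, 3)):
    """Return the majority label if its vote share reaches min_agreement.

    Otherwise return COMPLICATED rather than dropping the tweet.
    The threshold is an assumption, not a value from the paper.
    """
    if not annotations:
        raise ValueError("need at least one annotation")
    counts = Counter(annotations)
    label, votes = counts.most_common(1)[0]
    # Exact rational comparison avoids floating-point edge cases at 2/3.
    if Fraction(votes, len(annotations)) >= min_agreement:
        return label
    return COMPLICATED


if __name__ == "__main__":
    # Invented annotations, five per tweet, purely for demonstration.
    examples = {
        "tweet_a": ["POSITIVE"] * 4 + ["NEUTRAL"],
        "tweet_b": ["POSITIVE", "POSITIVE", "POSITIVE", "NEGATIVE", "NEUTRAL"],
    }
    for tweet_id, votes in examples.items():
        print(tweet_id, aggregate_label(votes))
```

With five annotators and this threshold, a label needs at least four of five votes to be accepted; a three-to-two split falls into the disputed class. A filtering pipeline would drop those tweets at the same point where this sketch returns `COMPLICATED`.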

“…For instance, in [20] and [26], sentiment analysis is used in order to evaluate short texts coming from Twitter and other resources. On [28], several multimodal sentiment analysis methods are reviewed.…”

Section: Sentiment Analysis (mentioning)

confidence: 99%

Analysing discussions in social networks using group decision making methods and sentiment analysis

Morente-Molinera, Kou, Peng, et al. 2018

Information Sciences

This is an author-produced version of a paper published in: Information Sciences 447 (2018). Abstract: Social networks are one of the most preferred environments for people to carry out debates. Because a large number of people can participate in the process, there is a need for tools that can analyse these discussions and extract useful information from them. In this paper, a novel way of determining how the debate is progressing, whether there is consensus among the participants, and which alternatives are preferred is presented. Sentiment analysis is used to measure the level of preference that social media users have for a certain set of alternatives. To test the presented scheme, a real application example that makes use of Twitter information is presented.

“…The simplest and also the most popular task of sentiment analysis on Twitter is to determine the overall sentiment expressed by the author of a tweet [38,39,40,55,56]. Typically, this means choosing one of the following three classes to describe the sentiment: POSITIVE, NEGATIVE, and NEUTRAL.…”

Section: Variants of the Task at SemEval (mentioning)

confidence: 99%

Semantic Sentiment Analysis of Twitter Data

Nakov 2017

Encyclopedia of Social Network Analysis and Mining

Self Cite

Microblog sentiment analysis; Twitter opinion mining. Glossary: Sentiment Analysis: This is text analysis aiming to determine the attitude of a speaker or a writer with respect to some topic or the overall contextual polarity of a piece of text. Definition: Sentiment analysis on Twitter is the use of natural language processing techniques to identify and categorize opinions expressed in a tweet, in order to determine the author's attitude toward a particular topic or in general. Typically, discrete labels such as positive, negative, neutral, and objective are used for this purpose, but it is also possible to use labels on an ordinal scale, or even continuous numerical values.
