As people increasingly use emoticons in text to express, stress, or disambiguate their sentiment, automated sentiment analysis tools must correctly account for these graphical cues. We analyze how emoticons typically convey sentiment and demonstrate how this can be exploited by using a novel, manually created emoticon sentiment lexicon to improve a state-of-the-art lexicon-based sentiment classification method. We evaluate our approach on 2,080 Dutch tweets and forum messages, all of which contain emoticons and have been manually annotated for sentiment. On this corpus, accounting at the paragraph level for the sentiment implied by emoticons significantly improves sentiment classification accuracy. This indicates that whenever emoticons are used, their associated sentiment dominates the sentiment conveyed by textual cues and forms a good proxy for intended sentiment.
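The idea of letting emoticons dominate textual cues at the paragraph level can be illustrated with a minimal sketch. It is not the authors' implementation: the two toy lexicons, the scores in them, and the function names `paragraph_score` and `classify` are assumptions made for illustration, and the rule that any emoticon fully overrides the words in its paragraph is a simplification of the approach described above.

```python
# Hypothetical toy lexicons; the paper's manually created lexicon is not reproduced here.
EMOTICON_LEXICON = {":)": 1.0, ":-)": 1.0, ":D": 1.0, ":(": -1.0, ":-(": -1.0}
WORD_LEXICON = {"good": 1.0, "great": 1.0, "bad": -1.0, "terrible": -1.0}


def normalize(token):
    """Lowercase a word token and strip common trailing punctuation."""
    return token.lower().strip(".,!?")


def paragraph_score(paragraph):
    """Score one paragraph; emoticons, when present, override the words."""
    tokens = paragraph.split()
    emoticon_scores = [EMOTICON_LEXICON[t] for t in tokens if t in EMOTICON_LEXICON]
    if emoticon_scores:
        # Assumption: the mean emoticon score stands in for the whole paragraph.
        return sum(emoticon_scores) / len(emoticon_scores)
    word_scores = [
        WORD_LEXICON[normalize(t)] for t in tokens if normalize(t) in WORD_LEXICON
    ]
    return sum(word_scores) / len(word_scores) if word_scores else 0.0


def classify(text):
    """Sum paragraph scores and map the sign of the total to a label."""
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    total = sum(paragraph_score(p) for p in paragraphs)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"
```

In this sketch, `classify("Great service :(")` is labeled negative even though the only sentiment-bearing word is positive, because the emoticon overrides the textual cue within its paragraph.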
As consumers generate ever more content describing their experiences with, e.g., products and brands in various languages, information systems that monitor a universal, language-independent measure of people's intended sentiment are crucial for today's businesses. To facilitate sentiment analysis of user-generated content, we propose to map the sentiment conveyed by unstructured natural language text to universal star ratings that capture intended sentiment. For these mappings, we consider a monotonically increasing step function, a naïve Bayes method, and a support vector machine. We demonstrate that the way in which natural language reveals intended sentiment differs across our data sets of Dutch and English texts. Additionally, the results of our experiments on modelling the relation between conveyed sentiment and intended sentiment suggest that language-specific sentiment scores can separate universal classes of intended sentiment from one another to a limited extent.
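Of the three mappings, the monotonically increasing step function is the simplest to picture: a sorted list of cut-off points divides the sentiment score range into intervals, one per star rating. The sketch below assumes scores in [-1, 1], five star classes, and hand-picked thresholds; all of these, along with the names `THRESHOLDS` and `stars`, are hypothetical, since the abstract does not state how the boundaries are chosen or fitted.

```python
from bisect import bisect_right

# Hypothetical cut-off points on a sentiment score in [-1, 1].
# In practice such thresholds would be fitted per language on annotated data.
THRESHOLDS = [-0.6, -0.2, 0.2, 0.6]


def stars(score, thresholds=THRESHOLDS):
    """Map a sentiment score to a 1-5 star rating with a step function.

    The rating is one plus the number of thresholds at or below the score,
    so a higher score can never yield a lower rating.
    """
    return 1 + bisect_right(thresholds, score)
```

Because the thresholds are sorted, the mapping is monotonically non-decreasing by construction. Using a separate threshold list per language would be one way to reflect the finding that conveyed sentiment relates to intended sentiment differently in Dutch and English texts.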