Posts on Twitter allow users to express ideas and opinions in a very dynamic way. The volume of data available on this platform is very high, and it may provide relevant clues about the public's judgement of a given product, event, or service. While the most common task in standard sentiment analysis is to classify utterances according to their polarity, detecting ironic senses represents a major challenge for Natural Language Processing. By observing a corpus of tweets, we propose a set of patterns that might suggest ironic or sarcastic statements. We thus developed specific clues for irony detection through the implementation and evaluation of this set of patterns.
This paper presents Polarity Recognizer in Portuguese (PIRPO), an algorithm for classifying sentiment in online reviews. PIRPO was built to identify polarity in user-generated accommodation reviews written in Portuguese. Each review is analysed according to concepts from a domain ontology. We decompose the review into sentences in order to assign a polarity to each ontology concept in the sentence. Preliminary results indicate an average F-score of 0.32 for polarity recognition.
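The sentence-level decomposition described above can be illustrated with a minimal sketch. It is not the PIRPO implementation: the ontology, the polarity lexicon, the naive sentence splitter, and the function names below are all hypothetical stand-ins chosen only to show how a polarity might be attached to each concept mentioned in a sentence.

```python
import re

# Hypothetical toy ontology: concept -> surface terms (not from the paper).
ONTOLOGY = {
    "room": {"quarto", "cama"},
    "staff": {"funcionários", "atendimento"},
}

# Hypothetical toy polarity lexicon (not from the paper).
LEXICON = {"ótimo": 1, "limpo": 1, "bom": 1, "péssimo": -1, "sujo": -1, "ruim": -1}


def split_sentences(review):
    """Split on sentence-final punctuation; a deliberately naive splitter."""
    return [s.strip() for s in re.split(r"[.!?]+", review) if s.strip()]


def concept_polarities(review):
    """Return {concept: summed polarity} over the sentences mentioning it."""
    scores = {}
    for sentence in split_sentences(review):
        tokens = re.findall(r"\w+", sentence.lower())
        # Sentence polarity: sum of the lexicon scores of its tokens.
        sentence_score = sum(LEXICON.get(token, 0) for token in tokens)
        for concept, terms in ONTOLOGY.items():
            # A concept is "in the sentence" if any of its terms occurs.
            if terms.intersection(tokens):
                scores[concept] = scores.get(concept, 0) + sentence_score
    return scores


review = "O quarto era limpo e ótimo. O atendimento foi péssimo!"
print(concept_polarities(review))
```

Scoring per sentence rather than per review keeps a negative remark about one concept from leaking into the polarity of another concept mentioned elsewhere in the same review.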
Sentiment Analysis is the field of computer science comprising techniques that aim to automatically extract opinions from texts. Usually, these techniques assign a Sentiment Orientation to the whole document (Document Level Sentiment Analysis). However, a document can express sentiment about several aspects of an entity. Methods that extract those aspects, paired with the sentiment about them, operate at the Aspect Level. Aspect-Based Sentiment Analysis approaches can be split into two stages: Aspect Extraction and Aspect Sentiment Classification. The literature presents works mainly focused on reviews about hotels, smartphones, or restaurants. In this work, we present an approach for Aspect Extraction based on Multilingual (Google's) and Portuguese (BERTimbau) BERT pre-trained models. Our experiments show that Aspect Extraction based on BERT pre-trained for Portuguese achieved a Balanced Accuracy of up to 93% on a corpus of reviews about the accommodation sector.
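Balanced Accuracy, the metric reported above, is the mean of the per-class recalls. The sketch below computes it from scratch on made-up labels; it assumes aspect extraction is scored as a per-token labelling task with classes `ASPECT` and `O`, which the abstract does not state, and the data are purely illustrative.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical token labels: aspect tokens are rare compared with "O".
y_true = ["O"] * 8 + ["ASPECT", "ASPECT"]
y_pred = ["O"] * 8 + ["ASPECT", "O"]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain_accuracy)                     # fraction of tokens labelled correctly
print(balanced_accuracy(y_true, y_pred))  # average recall over both classes
```

Because aspect tokens are usually a small minority, plain accuracy can look high even when half of the aspects are missed; averaging recall per class gives the rare class equal weight.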
Hate speech is language that attacks or denigrates a specific group based on characteristics such as race, ethnicity, or sexual orientation. Hate speech has become widespread, spreading through social networks, blogs, videos, and other communication channels. With anonymity and a sense of impunity, people feel encouraged to spread their hatred on the internet. In this work, we used BERTimbau, a BERT model for the Portuguese language, to classify hate speech in three Portuguese datasets available in the literature: OFFCOMBR-2, OFFCOMBR-3, and the Fortuna et al. (2019) dataset. We also applied some preprocessing and an oversampling technique to the datasets. Finally, we compared our results with those reported by other works in the literature. Experiments with BERTimbau, using preprocessing and oversampling, obtained better results than other classification techniques.
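The abstract does not say which oversampling technique was used. As one common possibility, the sketch below shows plain random oversampling, which duplicates minority-class examples until every class matches the size of the largest one. The function name, labels, and data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate minority-class examples until all classes are equally large."""
    rng = random.Random(seed)
    by_label = {}
    for text, label in zip(texts, labels):
        by_label.setdefault(label, []).append(text)

    target = max(len(items) for items in by_label.values())
    out_texts, out_labels = [], []
    for label, items in by_label.items():
        # Draw extra examples with replacement from the same class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Hypothetical imbalanced dataset: four neutral comments, one hateful one.
texts = ["a", "b", "c", "d", "e"]
labels = ["neutral", "neutral", "neutral", "neutral", "hate"]

new_texts, new_labels = random_oversample(texts, labels)
print(Counter(new_labels))
```

Oversampling of this kind is normally applied to the training split only, so that duplicated examples do not appear in the evaluation data.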