Gretel Liz De la Peña Sarracén scite author profile

This paper describes the system we developed for EVALITA 2018, the 6th evaluation campaign of Natural Language Processing and Speech tools for Italian, on Hate Speech Detection (HaSpeeDe). The task consists of automatically annotating Italian messages from two popular micro-blogging platforms, Twitter and Facebook, with a boolean value indicating the presence or absence of hate speech. We propose an attention-based Long Short-Term Memory Recurrent Neural Network in which the attention layer helps to calculate the contribution of each part of the text towards targeted hateful messages.
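
The role of the attention layer can be illustrated with a minimal sketch. It assumes dot-product scoring against a single context vector; the abstract does not specify the scoring function, and the names `attention_pool`, `hidden_states` and `context` are hypothetical. The toy vectors stand in for LSTM outputs, one per token.

```python
import math

def attention_pool(hidden_states, context):
    """Return (weights, pooled): softmax attention weights per time step
    and the weighted sum of the hidden states.

    Assumption: scores are plain dot products with a context vector;
    the original system may score time steps differently.
    """
    scores = [sum(h_i * c_i for h_i, c_i in zip(h, context)) for h in hidden_states]
    max_score = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, pooled

# Toy stand-ins for three LSTM hidden states (one per token).
hidden_states = [[0.1, 0.0], [2.0, 1.0], [0.2, -0.1]]
context = [1.0, 1.0]
weights, pooled = attention_pool(hidden_states, context)
```

The weights sum to one, so each weight can be read as the relative contribution of a token to the pooled representation that a classifier would consume.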

Overview of PAN 2021: Authorship Verification, Profiling Hate Speech Spreaders on Twitter, and Style Change Detection

Bevendorff

Chulvi

et al. 2021

The paper gives a brief overview of the three shared tasks to be organized at the PAN 2021 lab on digital text forensics and stylometry hosted at the CLEF conference. The tasks include authorship verification across domains, author profiling for hate speech spreaders, and style change detection for multiauthor documents. In part the tasks are new and in part they continue and advance past shared tasks, with the overall goal of advancing the state of the art, providing for an objective evaluation on newly developed benchmark datasets.

Offensive keyword extraction based on the attention mechanism of BERT and the eigenvector centrality using a graph representation

Rosso

2021

Pers Ubiquit Comput

The proliferation of harmful content on social media affects a large part of the user community. Therefore, several approaches have emerged to control this phenomenon automatically. However, this is still quite a challenging task. In this paper, we explore offensive language as a particular case of harmful content and focus our study on the analysis of keywords in available datasets composed of offensive tweets. Thus, we aim to identify relevant words in those datasets and analyze how they can affect model learning. For keyword extraction, we propose an unsupervised hybrid approach which combines the multi-head self-attention of BERT and reasoning on a word graph. The attention mechanism makes it possible to capture relationships among words in a context while a language model is learned. Then, the relationships are used to generate a graph from which we identify the most relevant words by using eigenvector centrality. Experiments were performed by means of two mechanisms. On the one hand, we used an information retrieval system to evaluate the impact of the keywords in recovering offensive tweets from a dataset. On the other hand, we evaluated a keyword-based model for offensive language detection. Results highlight some points to consider when training models with available datasets.
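
The graph step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the attention-derived relationships have already been aggregated into undirected weighted edges, and the edge weights and tokens below are invented placeholders. Eigenvector centrality is computed by power iteration on A + I, a standard shift that keeps the same ranking while avoiding oscillation.

```python
import math

def eigenvector_centrality(edges, iterations=200, tol=1e-10):
    """Eigenvector centrality of an undirected weighted graph.

    edges: dict mapping (word_a, word_b) -> weight (hypothetical
    aggregated attention scores). Uses power iteration on A + I.
    """
    nodes = sorted({word for pair in edges for word in pair})
    adjacency = {node: {} for node in nodes}
    for (a, b), weight in edges.items():
        adjacency[a][b] = adjacency[a].get(b, 0.0) + weight
        adjacency[b][a] = adjacency[b].get(a, 0.0) + weight
    score = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # (A + I) x: each node keeps its score and adds weighted neighbour scores.
        updated = {
            node: score[node] + sum(w * score[nb] for nb, w in adjacency[node].items())
            for node in nodes
        }
        norm = math.sqrt(sum(value * value for value in updated.values())) or 1.0
        updated = {node: value / norm for node, value in updated.items()}
        delta = max(abs(updated[node] - score[node]) for node in nodes)
        score = updated
        if delta < tol:
            break
    return score

# Placeholder word graph; weights are illustrative, not taken from BERT.
edges = {
    ("you", "insult"): 0.9,
    ("insult", "such"): 0.7,
    ("insult", "are"): 0.5,
    ("are", "you"): 0.3,
    ("are", "such"): 0.2,
}
scores = eigenvector_centrality(edges)
top_keyword = max(scores, key=scores.get)
```

Words are then ranked by score, and the highest-ranked ones serve as keyword candidates.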

PRHLT-UPV at SemEval-2020 Task 8: Study of Multimodal Techniques for Memes Analysis

Rosso

Giachanou

2020

This paper describes the system submitted by the PRHLT-UPV team for Task 8 of SemEval-2020: Memotion Analysis. We propose a multimodal model that combines pretrained models of the BERT and VGG architectures. The BERT model is used to process the textual information and the VGG model the images. The multimodal model is used to classify memes according to the presence of offensive, sarcastic, humorous and motivating content. A sentiment analysis of memes is also carried out with the proposed model. In the experiments, the model is compared with other approaches to analyze the relevance of the multimodal model. The results show encouraging performance on the final leaderboard of the competition, reaching good positions in the ranking of systems.