Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022
DOI: 10.18653/v1/2022.naacl-main.221
Features or Spurious Artifacts? Data-centric Baselines for Fair and Robust Hate Speech Detection

Abstract: Warning: this paper contains content that may be offensive or upsetting. Avoiding reliance on dataset artifacts when predicting hate speech is a cornerstone of robust and fair hate speech detection. In this paper we critically analyze lexical biases in hate speech detection via a cross-platform study, disentangling various types of spurious and authentic artifacts and analyzing their impact on out-of-distribution fairness and robustness. We experiment with existing approaches and propose simple yet surprisingly …
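The lexical-bias analysis the abstract describes can be illustrated with a common data-centric baseline: scoring each token by its pointwise mutual information (PMI) with the positive label, so that tokens whose presence is disproportionately associated with "hateful" labels surface as candidate spurious artifacts. This is a minimal sketch of that general idea, not the paper's exact method; the toy corpus and the function name are invented for the example.

```python
# Score tokens by PMI with the positive ("hateful") label: high-PMI tokens
# are candidate lexical artifacts worth auditing. Illustrative only.
import math
from collections import Counter

def token_label_pmi(texts, labels, positive=1):
    """PMI(token, positive label) over whitespace-tokenized documents."""
    n_docs = len(texts)
    n_pos = sum(1 for y in labels if y == positive)
    doc_freq = Counter()   # number of docs containing each token
    pos_freq = Counter()   # number of positive docs containing each token
    for text, y in zip(texts, labels):
        for tok in set(text.lower().split()):
            doc_freq[tok] += 1
            if y == positive:
                pos_freq[tok] += 1
    p_pos = n_pos / n_docs
    scores = {}
    for tok, df in doc_freq.items():
        p_tok = df / n_docs
        p_joint = pos_freq[tok] / n_docs
        if p_joint > 0:  # tokens never seen with the positive label are skipped
            scores[tok] = math.log2(p_joint / (p_tok * p_pos))
    return scores

# Toy corpus: "you people" co-occurs only with the positive label,
# while "awful" appears in both classes and gets PMI 0.
texts = ["you people are awful", "awful weather today",
         "you people ruin everything", "lovely weather today"]
labels = [1, 0, 1, 0]
pmi = token_label_pmi(texts, labels)
print(sorted(pmi, key=pmi.get, reverse=True)[:3])
```

On real data one would additionally filter by frequency (or use local mutual information) so that rare tokens do not dominate the ranking.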

Cited by 10 publications (24 citation statements)
References 28 publications
“…A thorough analysis on the impact of diverse auxiliary tasks on the performance of our models for PCL detection, and an investigation on the role of uncertainty and disagreement further confirmed the importance of considering annotators' point of view in PCL detection. As future work, we aim to test the presence and assess the impact of spurious lexical biases in the dataset (Ramponi and Tonelli, 2022) and extend our models to other genres, such as social media (Wang and Potts, 2019). We hope this work will encourage future efforts towards annotator-centric NLP, on PCL detection and other subjective tasks more broadly.…”
Section: Discussion
confidence: 99%
“…• group : the target of the abuse is a protected group or an individual as part of that group. We follow the widely-used definition of protected groups ( Röttger et al, 2021 ; Ramponi & Tonelli, 2022 ; Banko, MacKeen & Ray, 2020 ), namely groups based on characteristics such as religion, ethnicity, race, gender identity, age, sex or sexual orientation, disability, and national origins. The category is related to Davidson et al (2017) ’s “hate speech” definition and focus on protected characteristics.…”
Section: A Taxonomy for Religious Hate
confidence: 99%
“…In this section, we present the protocol we followed for collecting and annotating religious hate speech data in English and Italian. We then provide documentation in the form of data and artifacts statements ( Bender & Friedman, 2018 ; Ramponi & Tonelli, 2022 ), as well as summary statistics and insights about the annotated corpus. While data collection follows the same protocol for both languages, we adopt two different approaches to data annotation.…”
Section: Dataset Creation
confidence: 99%
“…Our research hypothesis is that the labels given by individuals with a low CRT score may be noisy and biased. To test the hypothesis and support the effectiveness of the dataset, we conducted experiments with varying architectures and metrics that cover human-centered desiderata of hate speech classifiers, such as detection performance, fairness (Ramponi and Tonelli, 2022), and explainability (Mathew et al, 2021).…”
Section: Introduction
confidence: 99%