Proceedings of the Fourth Workshop on Online Abuse and Harms 2020
DOI: 10.18653/v1/2020.alw-1.7

Impact of Politically Biased Data on Hate Speech Classification

Abstract: One challenge that social media platforms face nowadays is hate speech. Hence, automatic hate speech detection has been increasingly researched in recent years, in particular with the rise of deep learning. A problem of these models is their vulnerability to undesirable bias in training data. We investigate the impact of political bias on hate speech classification by constructing three politically biased data sets (left-wing, right-wing, politically neutral) and comparing the performance of classifiers tra…

Cited by 41 publications (31 citation statements)
References 29 publications
“…Since unintended bias in hate speech datasets can impair a model's performance (Waseem, 2016) and fairness (Vidgen et al., 2019a; Dixon et al., 2018), much recent work has investigated this phenomenon (Wiegand et al., 2019; Kim et al., 2020). Some work examined racial bias (Sap et al., 2019; Davidson et al., 2019; Xia et al., 2020); others explored gender bias (Gold and Zesch, 2018), aggregation bias (Balayn et al., 2018), and political bias (Wich et al., 2020b). The type of bias we examine in this study is annotator bias.…”
Section: Related Work
confidence: 97%
“…The authors leverage attention scores to quantify the relevance of different input features. Wich et al. (2020) apply post-hoc explainability on a custom German-language dataset to expose and estimate the impact of political bias on hate speech classifiers. More specifically, left- and right-wing political bias within the training data is visualized via DeepSHAP-based explanations (Lundberg and Lee, 2017).…”
Section: Explainability for Recognition Models
confidence: 99%
“…[9] reported problems with the association of minority-group language with hate in their data, while [47] have studied the influence of different biases in the sampling of popular abusive language datasets (e.g., topic and author bias). [46] analyzed how political bias influences hate speech classification models. [37] proposed social bias frames, a formalism that "aims to model the pragmatic frames in which people project social biases and stereotypes onto others" [37, p. 1].…”
Section: Related Work
confidence: 99%