Darkness can not drive out darkness: Investigating Bias in Hate Speech Detection Models

Elsafoury, Fatma

doi:10.18653/v1/2022.acl-srw.4

Cited by 5 publications

(6 citation statements)

References 42 publications

“…Huang et al (2020) argued that whether a statement is considered hate speech depends largely on who the speaker is. Elsafoury (2022) investigated the causal effect of the social and intersectional bias on the performance and unfairness of hate speech detection models. Therefore, some debiasing methods for this task have also been proposed.…”

Section: Monolingual Text Classification and Fairness Research (mentioning)

confidence: 99%

Model and Evaluation: Towards Fairness in Multilingual Text Classification

Lin, He, Tang et al. 2023

Preprint

Recently, more and more research has focused on addressing bias in text classification models. However, existing research mainly focuses on the fairness of monolingual text classification models, and research on fairness for multilingual text classification is still very limited. In this paper, we focus on the task of multilingual text classification and propose a debiasing framework for multilingual text classification based on contrastive learning. Our proposed method does not rely on any external language resources and can be extended to any other languages. The model contains four modules: multilingual text representation module, language fusion module, text debiasing module, and text classification module. The multilingual text representation module uses a multilingual pre-trained language model to represent the text, the language fusion module makes the semantic spaces of different languages tend to be consistent through contrastive learning, and the text debiasing…

“…I introduce the systematic offensive stereotyping (SOS) bias and formally define it as "A systematic association in the word embeddings between profanity and marginalized groups of people." (Elsafoury, 2022). I propose a method to measure it and validate it in static (Elsafoury et al, 2022a) and contextual word embeddings (Elsafoury et al, 2022a).…”

Section: The Offensive Stereotyping Bias Perspective (mentioning)

confidence: 99%

Proceedings of the Big Picture Workshop

2023

A key contribution to being a successful researcher in natural language processing, as in any area, is having a clear overarching vision of what your body of research is trying to accomplish. Using my own 40-year career as an example, I will attempt to provide general advice on formulating and pursuing a coherent research vision. In particular, I will focus on formulating a unique, personal objective that exploits your specific talents, knowledge, and passions, and that is distinct from the current popular trends in the field. I will also focus on formulating a vision that bridges existing fields of study to produce an overarching agenda that unifies previously disparate ideas.

Section: Contributions (mentioning)

confidence: 99%

Thesis Distillation: Investigating The Impact of Bias in NLP Models on Hate Speech Detection

Elsafoury

2023

Proceedings of the Big Picture Workshop

This paper is a summary of the work done in my PhD thesis, in which I investigate the impact of bias in NLP models on the task of hate speech detection from three perspectives: explainability, offensive stereotyping bias, and fairness. Then, I discuss the main takeaways from my thesis and how they can benefit the broader NLP community. Finally, I discuss important future research directions. The findings of my thesis suggest that the bias in NLP models impacts the task of hate speech detection from all three perspectives, and that unless we start incorporating social sciences in studying bias in NLP models, we will not effectively overcome the current limitations of measuring and mitigating bias in NLP models.
