Proceedings of the Seventh Workshop on Noisy User-Generated Text (W-NUT 2021), 2021
DOI: 10.18653/v1/2021.wnut-1.35

Detecting Cross-Geographic Biases in Toxicity Modeling on Social Media

Abstract: Online social media platforms increasingly rely on Natural Language Processing (NLP) techniques to detect abusive content at scale in order to mitigate the harms it causes to their users. However, these techniques suffer from various sampling and association biases present in training data, often resulting in sub-par performance on content relevant to marginalized groups, potentially furthering disproportionate harms towards them. Studies on such biases so far have focused on only a handful of axes of disparit…

Cited by 19 publications (21 citation statements)
References 37 publications
“…However, the (finite set of) identity-related and offensive tokens considered in this work are all in English and centered around Western cultural context. We leave the evaluation of our methodology to assess whether there are language- or more broadly culture-dependent changes for future work, following recent work on biases in geo-cultural contexts (Ghosh et al, 2021).…”
Section: Ethical Considerations
Citation type: mentioning
confidence: 99%
“…Ghosh et al [33] noted that a cross-geographical/cultural application of toxicity detectors can lead to lexical bias. They noted that majority of the literature focuses on the English language and the geo-cultural scenarios of a handful of countries.…”
Section: Cross-geographic Bias
Citation type: mentioning
confidence: 99%
“…As seen in Section 5, this false positive bias can be explained through the over-representation of specific terms in the toxic class of the training dataset. Based on the above observations, Ghosh et al [33] then proposed a two-step weakly-supervised method to detect lexical bias for cross-geocultural toxic content. They carried out this analysis using unlabeled tweets collected from across seven countries.…”
Section: Cross-geographic Bias
Citation type: mentioning
confidence: 99%