Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track
DOI: 10.18653/v1/2023.emnlp-industry.26

Unveiling Identity Biases in Toxicity Detection: A Game-Focused Dataset and Reactivity Analysis Approach

Josiane Van Dorpe, Zachary Yang, Nicolas Grenon-Godbout, et al.

Abstract: Identity biases commonly arise from annotated datasets, can propagate into language models, and can cause further harm to marginalized groups. Existing bias-benchmarking datasets focus mainly on gender or racial biases and are designed to pinpoint which class a model is biased towards. They are also not designed for the gaming industry, a concern for models built to detect toxicity in video game chat. We propose a dataset and a method to highlight oversensitive terms using reactivity analysis and the mod…

Cited by 1 publication (3 citation statements) · References 21 publications
“…Post-training assessment necessitates the exploration of various methodologies to analyse potential biases within the model. In Van Dorpe et al. [204], a set of templates was devised to scrutinize the impact of protected group presence or absence in toxic detection. This analysis includes the calculation of a reactivity score, determined by assessing the average predictive difference across all sentence templates.…”
Section: Biases in Toxicity Detection (citation type: mentioning, confidence: 99%)
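
The reactivity analysis described in this citation statement reduces to a simple computation: fill a set of sentence templates with an identity term, fill the same templates with a neutral baseline, and average the classifier's predictive difference. Below is a minimal Python sketch of that reading; the classifier, templates, and terms are hypothetical placeholders, not the paper's actual artifacts, and the paper may compare presence versus absence directly rather than against a baseline term.

```python
# Minimal sketch of a template-based reactivity score (hypothetical API).
from statistics import mean
from typing import Callable, List

def reactivity_score(
    predict_toxicity: Callable[[str], float],  # returns P(toxic) in [0, 1]
    templates: List[str],                      # e.g. "I met a {} player today."
    identity_term: str,                        # protected-group term under test
    baseline_term: str = "person",             # neutral filler for comparison
) -> float:
    """Average predictive difference across all sentence templates when the
    identity term is present versus a neutral baseline."""
    diffs = [
        predict_toxicity(t.format(identity_term))
        - predict_toxicity(t.format(baseline_term))
        for t in templates
    ]
    return mean(diffs)

# Toy usage: a classifier that overreacts to one identity term.
toy_model = lambda s: 0.9 if "muslim" in s else 0.1
templates = ["I met a {} player today.", "That {} is on my team."]
print(reactivity_score(toy_model, templates, "muslim"))  # 0.8 -> oversensitive term
```

A score far from zero flags the term as oversensitive for the model, which matches the paper's stated goal of highlighting such terms.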
“…Despite the primary focus of this review not being centred on mitigating toxic detection bias, we have observed evaluation metrics [49,146,147], bias analysis methodologies [117,204,212], and bias mitigation techniques [45,88,96,112,134,139,209,221] that are cornerstones of the improvement of these models. Biases in toxic detection are not only embedded during the training phase but also inherent in the base models [136,147], resulting in the exacerbation of these biases and making it harder to assess and mitigate them after training.…”
Section: Bias (citation type: mentioning, confidence: 99%)