Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018
DOI: 10.18653/v1/n18-2002

Gender Bias in Coreference Resolution

Abstract: We present an empirical study of gender bias in coreference resolution systems. We first introduce a novel, Winograd schema-style set of minimal pair sentences that differ only by pronoun gender. With these Winogender schemas, we evaluate and confirm systematic gender bias in three publicly-available coreference resolution systems, and correlate this bias with real-world and textual gender statistics.
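
As a rough illustration of the minimal-pair setup described in the abstract, the sketch below builds sentence pairs that are identical except for pronoun gender; the template and occupation/participant choices are hypothetical placeholders, not items from the released Winogender data.

```python
# Minimal sketch (assumed template, not the released Winogender data): build
# sentence pairs that differ only in pronoun gender, so any difference in a
# coreference system's output can be attributed to the pronoun alone.
TEMPLATE = "The {occupation} told the {participant} that {pronoun} would arrive soon."

def make_minimal_pair(occupation: str, participant: str) -> tuple:
    """Return the same sentence instantiated with a male and a female pronoun."""
    he = TEMPLATE.format(occupation=occupation, participant=participant, pronoun="he")
    she = TEMPLATE.format(occupation=occupation, participant=participant, pronoun="she")
    return he, she

if __name__ == "__main__":
    for occupation, participant in [("doctor", "patient"), ("engineer", "client")]:
        for sentence in make_minimal_pair(occupation, participant):
            print(sentence)
```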

Cited by 324 publications (373 citation statements)
References 18 publications

“…The results prove that BERT expresses strong preferences for male pronouns, raising concerns with using BERT in downstream tasks like resume filtering. NLP applications ranging from core tasks such as coreference resolution (Rudinger et al., 2018) and language identification (Jurgens et al., 2017), to downstream systems such as automated essay scoring (Amorim et al., 2018), exhibit inherent social biases which are attributed to the datasets used to train the embeddings (Barocas and Selbst, 2016; Zhao et al., 2017; Yao and Huang, 2017).…”
Section: Real World Implications
confidence: 99%
“…For coreference resolution, Rudinger et al. (2018) and Zhao et al. (2018b) independently designed GBETs based on Winograd Schemas. The corpus consists of sentences which contain a gender-neutral occupation (e.g., doctor), a secondary participant (e.g., patient), and a gendered pronoun that refers to either the occupation or the participant.…”
Section: Task
confidence: 99%
“…If that same model predicts females and males coreferent to "doctor" with 20% and 60% accuracy, respectively, then the global average accuracy for each gender is equivalent, yet the model exhibits bias. Therefore, Zhao et al. (2018b) and Rudinger et al. (2018) design metrics to analyze gender bias by examining how the performance difference between genders with respect to each occupation correlates with the occupational gender statistics from the U.S. Bureau of Labor Statistics.…”
Section: Task
confidence: 99%
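
A minimal sketch of that style of analysis is given below, with invented per-occupation accuracy numbers and invented occupational statistics standing in for real system output and U.S. Bureau of Labor Statistics figures.

```python
# Sketch of the bias analysis described above: for each occupation, compare a
# coreference system's accuracy on female vs. male pronouns, then correlate the
# gap with the real-world share of women in that occupation. All numbers here
# are invented placeholders, not results from Rudinger et al. (2018) or BLS data.
from scipy.stats import pearsonr

# occupation -> (accuracy on "she" schemas, accuracy on "he" schemas)
accuracy = {
    "doctor":    (0.20, 0.60),
    "nurse":     (0.75, 0.40),
    "engineer":  (0.30, 0.65),
    "librarian": (0.70, 0.45),
}

# occupation -> percent of workers who are women (placeholder values)
pct_female = {"doctor": 40.0, "nurse": 90.0, "engineer": 15.0, "librarian": 80.0}

occupations = sorted(accuracy)
gaps = [accuracy[o][0] - accuracy[o][1] for o in occupations]  # female minus male accuracy
shares = [pct_female[o] for o in occupations]

r, p = pearsonr(shares, gaps)
print(f"Pearson r between %-female and accuracy gap: {r:.2f} (p={p:.3f})")
```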
“…This results in gender-stereotypical vector analogies à la Mikolov et al. (2013), such as man:computer programmer :: woman:homemaker (Bolukbasi et al., 2016), and such bias has been shown to materialise in a variety of downstream tasks, e.g. coreference resolution (Rudinger et al., 2018; Zhao et al., 2018).…”
Section: Introduction
confidence: 99%
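
The analogy-style probe mentioned above amounts to vector arithmetic over pretrained word embeddings; a minimal sketch follows. The gensim model name and the exact neighbours it returns are assumptions here, not reproduced results from the cited papers.

```python
# Sketch of the vector-analogy probe: "man is to computer_programmer as woman is to ?".
# Requires gensim and a downloadable pretrained embedding; the specific model and the
# nearest neighbours it returns depend on the embeddings used.
import gensim.downloader as api

vectors = api.load("word2vec-google-news-300")  # pretrained word2vec embeddings (assumed model name)

# v(computer_programmer) - v(man) + v(woman) ~= v(?)
result = vectors.most_similar(
    positive=["computer_programmer", "woman"],
    negative=["man"],
    topn=3,
)
for word, score in result:
    print(f"{word}\t{score:.3f}")
```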