Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
DOI: 10.18653/v1/2020.emnlp-main.656
Queens are Powerful too: Mitigating Gender Bias in Dialogue Generation

Abstract: Social biases present in data are often directly reflected in the predictions of models trained on that data. We analyze gender bias in dialogue data, and examine how this bias is not only replicated, but is also amplified in subsequent generative chit-chat dialogue models. We measure gender bias in six existing dialogue datasets before selecting the most biased one, the multi-player text-based fantasy adventure dataset LIGHT (Urbanek et al., 2019), as a testbed for bias mitigation techniques. We consider three…

Cited by 104 publications (105 citation statements)
References 64 publications
“…For dialogue, gender biases in training corpora have been found to be amplified in machine learning models (Dinan et al., 2020; Liu et al., 2019). While many of the works cited above proposed methods of mitigating the unwanted effects of gender on text, Hall Maudslay et al. (2019), Liu et al. (2019), Zmigrod et al. (2019), and Dinan et al. (2020) in particular relied on counterfactual data to alter the training distribution to offset gender-based statistical imbalances (see §4.2 for more discussion of training set imbalances). Also relevant is Kang et al. (2019, PASTEL), which introduced a parallel style corpus and showed gains on style transfer across binary genders.…”
Section: Related Work
confidence: 99%
“…By learning to associate control variables with textual properties, generative models can be controlled at inference time to adjust the generated text based on the desired properties of the user. This has been applied to a variety of different cases, including generating text of different lengths (Fan et al., 2018a), generating questions in chit-chat (See et al., 2019), and reducing bias (Dinan et al., 2020).…”
Section: Controllable Generation
confidence: 99%