Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/p19-1160

Gender-preserving Debiasing for Pre-trained Word Embeddings

Abstract: Word embeddings learnt from massive text collections have demonstrated significant levels of discriminative biases such as gender, racial or ethnic biases, which in turn bias the downstream NLP applications that use those word embeddings. Taking gender-bias as a working example, we propose a debiasing method that preserves non-discriminative gender-related information, while removing stereotypical discriminative gender biases from pre-trained word embeddings. Specifically, we consider four types of informatio…
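The paper's method is not reproduced on this page, but the core distinction the abstract draws (removing stereotypical gender associations while preserving legitimate gender information) can be illustrated with a minimal sketch. The sketch below uses the simpler projection-based debiasing idea of Bolukbasi et al. (2016) together with an explicit keep-list of gendered words; the inputs (a dict of NumPy vectors, the word pairs and lists) are illustrative assumptions, and this is not the paper's own method, which fine-tunes embeddings rather than applying a linear projection.

```python
# Minimal illustrative sketch (NOT the paper's method): projection-based
# debiasing that preserves words carrying legitimate gender information.
import numpy as np

def gender_direction(emb, pairs=(("he", "she"), ("man", "woman"))):
    """Estimate a gender direction as the mean difference vector over
    definitional word pairs (the pairs here are placeholders)."""
    v = np.mean([emb[a] - emb[b] for a, b in pairs], axis=0)
    return v / np.linalg.norm(v)

def debias_preserving_gender(emb, gendered_words, g):
    """Remove the gender component from every vector EXCEPT those of
    explicitly gendered words (e.g. 'mother', 'king'), which keep their
    non-discriminative gender information intact."""
    debiased = {}
    for word, vec in emb.items():
        if word in gendered_words:
            debiased[word] = vec                       # preserve gender info
        else:
            debiased[word] = vec - np.dot(vec, g) * g  # project out bias
    return debiased
```

The keep-list is what makes the sketch "gender-preserving": a plain hard-debias would neutralize legitimately gendered words such as "mother" as aggressively as stereotyped ones such as occupation names.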

Cited by 94 publications (106 citation statements)
References 48 publications
“…Gender affects myriad aspects of NLP, including corpora, tasks, algorithms, and systems (Costa-jussà, 2019; Sun et al., 2019). For example, statistical gender biases are rampant in word embeddings (Jurgens et al., 2012; Bolukbasi et al., 2016; Caliskan et al., 2017; Garg et al., 2018; Zhao et al., 2018b; Basta et al., 2019; Chaloner and Maldonado, 2019; Du et al., 2019; Ethayarajh et al., 2019; Kaneko and Bollegala, 2019; Kurita et al., 2019), including multilingual ones (Escudé Font and Costa-jussà, 2019; Zhou et al., 2019), and affect a wide range of downstream tasks including coreference resolution (Zhao et al., 2018a; Cao and Daumé III, 2020; Emami et al., 2019), part-of-speech and dependency parsing (Garimella et al., 2019), language modeling (Qian et al., 2019; Nangia et al., 2020), appropriate turn-taking classification (Lepp, 2019), relation extraction (Gaut et al., 2020), identification of offensive content (Sharifirad and Matwin, 2019), and machine translation (Stanovsky et al., 2019; Hovy et al., 2020).…”
Section: Related Work (mentioning)
confidence: 99%
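Several of the studies cited above (notably Caliskan et al., 2017) quantify such embedding bias with the Word Embedding Association Test (WEAT). A compact sketch of the WEAT effect size is given below, assuming embeddings are held in a dict of NumPy vectors; the contents of the target and attribute word sets are up to the experimenter.

```python
import numpy as np

def cos(u, v):
    """Cosine similarity between two vectors."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def weat_effect_size(emb, X, Y, A, B):
    """WEAT effect size (Caliskan et al., 2017): differential association
    of target word sets X, Y (e.g. two sets of occupations) with attribute
    word sets A, B (e.g. male vs. female terms)."""
    def s(w):  # association of one target word with A versus B
        return (np.mean([cos(emb[w], emb[a]) for a in A])
                - np.mean([cos(emb[w], emb[b]) for b in B]))
    sx, sy = [s(x) for x in X], [s(y) for y in Y]
    return (np.mean(sx) - np.mean(sy)) / np.std(sx + sy)
```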
“…This post-processing operation has been repeatedly proposed in different contexts, such as with distributional (counting-based) word representations (Sahlgren et al., 2016) and sentence embeddings (Arora et al., 2017). Independently of the above, autoencoders have been widely used for fine-tuning pre-trained word embeddings, such as for removing gender bias (Kaneko and Bollegala, 2019), meta-embedding (Bao and Bollegala, 2018), cross-lingual word embedding (Wei and Deng, 2017) and domain adaptation (Chen et al., 2012), to name a few. However, it is unclear whether better performance is obtained simply by applying an autoencoder (a self-supervised task, requiring no labelled data) to pre-trained word embeddings, without performing any task-specific fine-tuning (which requires labelled data for the task).…”
Section: Introduction (mentioning)
confidence: 99%
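As a rough picture of the autoencoder-based fine-tuning this excerpt discusses, the sketch below learns to reconstruct pre-trained vectors with a single-hidden-layer autoencoder. It assumes PyTorch; the architecture, loss, and hyperparameters are illustrative placeholders and do not reproduce any cited paper's exact setup (in particular, the debiasing method of Kaneko and Bollegala (2019) adds bias-specific losses on top of reconstruction).

```python
# Illustrative sketch, assuming PyTorch; hyperparameters are placeholders.
import torch
import torch.nn as nn

class EmbeddingAutoencoder(nn.Module):
    """Single-hidden-layer autoencoder over pre-trained word vectors."""
    def __init__(self, dim, hidden):
        super().__init__()
        self.encoder = nn.Linear(dim, hidden)
        self.decoder = nn.Linear(hidden, dim)

    def forward(self, x):
        return self.decoder(torch.tanh(self.encoder(x)))

def autoencode(vectors, hidden=300, epochs=50, lr=1e-3):
    """Self-supervised fine-tuning: learn to reconstruct each pre-trained
    vector (no labelled data needed), then return the reconstructions."""
    model = EmbeddingAutoencoder(vectors.shape[1], hidden)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(vectors), vectors)
        loss.backward()
        opt.step()
    with torch.no_grad():
        return model(vectors)
```

Debiasing variants of this idea add extra terms to the reconstruction objective (e.g. penalizing gender components of stereotyped words) rather than relying on reconstruction alone.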
“…Recently, the NLP community has focused on exploring gender bias in NLP systems (Sun et al., 2019), uncovering many gender disparities and harmful biases in algorithms and text (Cao and Daumé III 2020; Chang and McKeown 2019; Costa-jussà 2019; Du et al. 2019; Emami et al. 2019; Garimella et al. 2019; Gaut et al. 2020; Habash et al. 2019; Hashempour 2019; Hoyle et al. 2019; Lee et al. 2019a; Lepp 2019; Qian 2019; Sharifirad and Matwin 2019; Stanovsky et al. 2019; O'Neil 2016; Blodgett et al. 2020; Nangia et al. 2020). Particular attention has been paid to uncovering, analyzing, and removing gender biases in word embeddings (Basta et al., 2019; Kaneko and Bollegala, 2019; Zhao et al., 2018b; Bolukbasi et al., 2016). This word embedding work has even extended to multilingual work on gender-marking (Williams et al., 2019; Zhou et al., 2019).…”
Section: Related Work (mentioning)
confidence: 99%