Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/p19-1159

Mitigating Gender Bias in Natural Language Processing: Literature Review

Abstract: As Natural Language Processing (NLP) and Machine Learning (ML) tools rise in popularity, it becomes increasingly vital to recognize the role they play in shaping societal biases and stereotypes. Although NLP models have shown success in modeling various applications, they propagate and may even amplify gender bias found in text corpora. While the study of bias in artificial intelligence is not new, methods to mitigate gender bias in NLP are relatively nascent. In this paper, we review contemporary studies on r…

Cited by 317 publications (268 citation statements)
References: 48 publications
“…For instance, dense vector representations called word embeddings [86] are able to capture semantic relationships between words, such as sex, gender and ethnic relationships [87], thus absorbing biases existing in the training corpus [88]. Methods for bias mitigation in NLP have been recently reviewed, including learning gender-neutral embeddings and tagging the data points to preserve the gender of the source [89].…”
Section: Natural Language Processing (mentioning)
confidence: 99%
“…Alternate models also exist that build embeddings from medical databases and the scientific literature; however, for this paper we focus on the use of Word2Vec and GloVe, as opposed to the narrower datasets described in more detail in the paper by Kalyan et al. [52]. As described by Pennington et al., GloVe embeddings were trained on text corpora from Wikipedia data, Gigaword, and web data from Common Crawl, which built a vocabulary of 400,000 frequent words [57]. Word2Vec was trained on the Google News dataset (containing ~100 billion words), which resulted in a model of 300-dimensional vectors for 3 million words and phrases [58].…”
Section: Plos One (mentioning)
confidence: 99%
“…Gender classification from text is a fundamental task in author profiling, and in particular author profiling on social media has recently received a lot of attention from the NLP community (Bamman et al., 2014; Sap et al., 2014; Ciot et al., 2013). Additionally, gender is often in the spotlight of research on fairness and bias in NLP (Sun et al., 2019). Biases are often introduced by demographic and other imbalances in training data.…”
Section: Gender Classification Bias (mentioning)
confidence: 99%