2021
DOI: 10.48550/arxiv.2112.14168
Preprint

A Survey on Gender Bias in Natural Language Processing

Abstract: Language can be used as a means of reproducing and enforcing harmful stereotypes and biases and has been analysed as such in numerous research. In this paper, we present a survey of 304 papers on gender bias in natural language processing. We analyse definitions of gender and its categories within social sciences and connect them to formal definitions of gender bias in NLP research. We survey lexica and datasets applied in research on gender bias and then compare and contrast approaches to detecting and mitiga…

Cited by 16 publications (35 citation statements)
References 105 publications (246 reference statements)
“…The problem is then in defining this list of words. To avoid overfitting to one axis of gender bias, we construct a composite score based on pre-existing lists which have in turn been defined through experiments and empirical assessments (Schmader et al, 2007;Gaucher et al, 2011;Sap et al, 2017;Stanczak and Augenstein, 2021). The presence of words which are more likely to be associated with one gender does not directly result in biased outcomes.…”
Section: Measuring Bias
Citation type: mentioning
confidence: 99%
“…n positive signifiers / n words. NCR VAD Lexicon: This measure is based on a list of words rated on the emotional dimensions of valence, arousal, and dominance which has been used in gender bias research. In particular, weakness (low dominance), passiveness (low arousal or agency), and badness (valence) may be associated with a female stereotype (Stanczak and Augenstein, 2021). Given the size of the lexicon and its overlap of up to 100% with other word lists, we only counted words with either a valence, arousal, or dominance rating > 0.75 on a scale from 0 to 1.…”
Section: E Constructing Bias Measures
Citation type: mentioning
confidence: 99%
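
The thresholding step described in the statement above can be sketched roughly as follows. This is a minimal illustration, not the citing authors' implementation: the toy lexicon, its ratings, and the function names `high_rated_words` and `signifier_ratio` are all hypothetical, and the ratio of matched words to total words is an assumption based on the fragmentary "n positive signifiers / n words" at the start of the quoted text.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical toy lexicon: word -> (valence, arousal, dominance), each in [0, 1].
# These ratings are made-up placeholders, NOT values from any published lexicon.
TOY_LEXICON: Dict[str, Tuple[float, float, float]] = {
    "leader": (0.70, 0.60, 0.90),
    "calm": (0.80, 0.10, 0.50),
    "table": (0.50, 0.20, 0.40),
    "panic": (0.10, 0.95, 0.30),
}


def high_rated_words(
    lexicon: Dict[str, Tuple[float, float, float]], threshold: float = 0.75
) -> Set[str]:
    """Keep words whose valence, arousal, or dominance rating exceeds the threshold."""
    return {
        word
        for word, ratings in lexicon.items()
        if any(rating > threshold for rating in ratings)
    }


def signifier_ratio(tokens: Iterable[str], signifiers: Set[str]) -> float:
    """Share of tokens found in the signifier set (assumed form of the measure)."""
    token_list = list(tokens)
    if not token_list:
        return 0.0
    hits = sum(1 for token in token_list if token.lower() in signifiers)
    return hits / len(token_list)


signifiers = high_rated_words(TOY_LEXICON)
ratio = signifier_ratio(["The", "leader", "stayed", "calm"], signifiers)
```

With the placeholder ratings above, "table" is dropped because none of its three ratings exceeds 0.75, while the other three words are kept because at least one dimension does.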
“…Sun et al (2021) propose a rewriting task where data is transferred from gendered to gender-neutral pronouns to train more inclusive language models. Cao and Daumé III (2020) and Dev et al (2021) discuss the necessity of including non-binary pronouns into NLP research (see also Stanczak and Augenstein (2021)).…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Bias in NLP systems often goes without notice, it's often not even detected until after the systems are launched and used by consumers, which can have adverse effects on our society, such as when it shows false information to people which leads them to believe untrue things about society or themselves; thereby changing their behavior for better or worse (Stanczak and Augenstein, 2021). The harm of bias in NLP has been understated by some people and overstated by others, who dismiss its relevance or refuse to engage with it altogether.…”
Section: Bias Statement
Citation type: mentioning
confidence: 99%