StereoSet: Measuring stereotypical bias in pretrained language models
Preprint, 2020
DOI: 10.48550/arxiv.2004.09456

Cited by 75 publications (129 citation statements)
References 0 publications
“…Whether using such unlabeled data, as we do in this work, can help with bias is still an open question. Previous work suggests that training on large amounts of data alone is not sufficient to avoid unwanted biases, since many papers have pointed out biases in large language models (Abid et al., 2021; Nadeem et al., 2020; Gehman et al., 2020). However, recent work has also suggested that pre-trained models can be trained to be more robust against some types of spurious correlations (Hendrycks et al., 2020; Tu et al., 2020) and that additional domain- and task-specific pre-training can also improve performance.…”
Section: A7 CivilComments-WILDS (mentioning)
confidence: 99%
“…Large language models have achieved impressive results on many tasks; however, there is also significant evidence that they are prone to biases [4], [38], [39]. Debiasing these models remains largely an open problem: most in-processing algorithms are inapplicable or computationally prohibitive owing to large, highly complex model architectures and the challenges of handling text inputs.…”
Section: Post-processing For Debiasing Large Language Models (mentioning)
confidence: 99%
“…However, pretrained LMs are well known for exhibiting unintended social biases involving race, gender, or religion [28,31,42]. These biases result in unfair allocation of resources [20,51], stereotyping that propagates negative generalizations about particular social groups [35], differences in system performance across social groups, text that misrepresents the distribution of social groups in the population, and language that denigrates particular social groups [4,18,28]. Moreover, these biases may be further exacerbated by domain-specific LM fine-tuning for downstream tasks [22,35].…”
Section: Introduction (mentioning)
confidence: 99%
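
The excerpts above discuss measuring and mitigating stereotypical bias in pretrained language models, the topic of the cited StereoSet preprint. As a rough illustration only, the sketch below probes a masked LM by comparing pseudo-log-likelihoods of a stereotype versus an anti-stereotype sentence; the model name, the sentence pair, and the scoring function are illustrative assumptions and not the StereoSet authors' implementation or evaluation protocol.

# Minimal sketch (assumptions: Hugging Face transformers, bert-base-uncased,
# hypothetical stereotype/anti-stereotype sentence pair). Not the StereoSet code.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def pseudo_log_likelihood(sentence: str) -> float:
    # Mask each token in turn and sum the log-probability the model
    # assigns to the original token at that position.
    ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    for i in range(1, ids.size(0) - 1):  # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(input_ids=masked.unsqueeze(0)).logits[0, i]
        total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
    return total

# Hypothetical stereotype / anti-stereotype pair (illustrative only).
stereotype = "Girls tend to be more soft than boys."
anti_stereotype = "Girls tend to be more determined than boys."

# True if the model assigns higher likelihood to the stereotypical sentence.
print(pseudo_log_likelihood(stereotype) > pseudo_log_likelihood(anti_stereotype))

Aggregating such comparisons over many sentence pairs, rather than a single pair, is what allows a bias score to be reported for a model; a single comparison like the one printed here is only meaningful as a demonstration of the mechanics.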