2021
DOI: 10.48550/arxiv.2105.04054
Preprint

Societal Biases in Language Generation: Progress and Challenges

Abstract: Technology for language generation has advanced rapidly, spurred by advancements in pre-training large models on massive amounts of data and the need for intelligent agents to communicate in a natural manner. While techniques can effectively generate fluent text, they can also produce undesirable societal biases that can have a disproportionately negative impact on marginalized populations. Language generation presents unique challenges for biases in terms of direct user interaction and the structure of decodi…


Cited by 5 publications (7 citation statements)
References 72 publications (74 reference statements)
“…Distinguishing "statistical bias" from "social bias": Concerns regarding "bias" in language models generally revolve around distributional skews that result in unfavourable impacts for particular social groups (Sheng et al., 2021). We note that there are different definitions of "bias" and "discrimination" in classical statistics compared to sociotechnical studies.…”
Section: Discussion (mentioning)
confidence: 89%
“…Before ChatGPT emerged, extensive academic research was conducted on the ethical risks associated with LLMs, particularly those involved in natural language generation (NLG). These investigations delved into potential societal impacts, highlighting concerns ranging from the confident distribution of inaccurate information to the creation of widespread false news and information [20,21]. With the advent of ChatGPT, concerns have increased, as this study highlighted its risks [22].…”
Section: Large Language Models (mentioning)
confidence: 91%
“…LGMs relates to the social harms that arise from the model performing more poorly for some demographic groups, generating discriminatory speech, or further propagating discriminatory outcomes through the generated text [1,29].…”
Section: Fairness In (mentioning)
confidence: 99%
“…Despite the ever-increasing power of LGMs in generating realistic and cohesive language, they are also susceptible to learning harmful language and encoding undesirable bias across identities that can retain and magnify harmful content and stereotypes [5,28,29,33]. This reality necessitates that both the developers and the ultimate users of an LGM are keenly aware of its ethical risk levels to ensure reliable behavior.…”
Section: Introduction (mentioning)
confidence: 99%