Proceedings of the 4th Workshop on Gender Bias in Natural Language Processing (GeBNLP) 2022
DOI: 10.18653/v1/2022.gebnlp-1.27
Why Knowledge Distillation Amplifies Gender Bias and How to Mitigate from the Perspective of DistilBERT

Abstract: Knowledge distillation is widely used to transfer the language understanding of a large model to a smaller model. However, after knowledge distillation, the smaller model was found to exhibit more gender bias than the source large model. This paper studies what causes gender bias to increase after the knowledge distillation process. Moreover, we suggest applying a variant of mixup on knowledge distillation, which is used to increase generalizability during the distillation process, not for augm…

Cited by 5 publications (6 citation statements)
References 10 publications (15 reference statements)
“…The only exception is Distilled mUSE, where the job-prestige dimension applied to countries still correlates with the country's GDP and the east-west axis. This is consistent with previous work showing that distilled student models exhibit more biases than models trained on authentic data (Vamvas and Sennrich, 2021; Ahn et al., 2022).…”
Section: Results (supporting)
confidence: 93%
“…This result suggests that the models do not connect individual social prestige with the country of origin. The exception is a small model distilled from the Multilingual Universal Sentence Encoder (Yang et al., 2020) that seems to mix these two and thus confirms previous work claiming that distilled models are more prone to biases (Ahn et al., 2022).…”
Section: Introduction (supporting)
confidence: 83%
“…Since such a biased ratio is not favorable, the generative AI software that mimics such a biased ratio is also not favorable. On the other hand, there is also a chance that AI models could generate a more biased ratio [1]. Such imbalanced generations may reinforce the bias or stereotypes.…”
Section: Social Bias (mentioning)
confidence: 99%
“…Image generation models, which generate images from a given text, have recently drawn a lot of interest from academia and the industry. For example, Stable Diffusion [37], an open-sourced latent text-to-image diffusion model, has 60K stars on GitHub. And Midjourney, an AI image generation commercial software product launched in July 2022, has more than 15 million users [13].…”
Section: Introduction (mentioning)
confidence: 99%
“…Finally, while the interplay and tradeoff between privacy, efficiency, and fairness in tabular data has received extensive examination (Hooker et al., 2020; Lyu et al., 2020), comparatively fewer studies have been conducted in NLP (Tal et al., 2022; Ahn et al., 2022; Hessenthaler et al., 2022).…”
Section: Introduction (mentioning)
confidence: 99%