Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/2021.emnlp-main.116
Debiasing Methods in Natural Language Understanding Make Bias More Accessible

Abstract: Model robustness to bias is often determined by the generalization on carefully designed out-of-distribution datasets. Recent debiasing methods in natural language understanding (NLU) improve performance on such datasets by pressuring models into making unbiased predictions. An underlying assumption behind such methods is that this also leads to the discovery of more robust features in the model's inner representations. We propose a general probing-based framework that allows for posthoc interpretation of bias…

Cited by 5 publications (4 citation statements)
References 25 publications
“…Our findings suggest that practitioners of NLP should take special care when adopting previously debiased models and inspect them carefully, perhaps using our framework. Our results differ from those of Mendelson and Belinkov (2021a), who found that debiasing increases bias extractability as measured by compression rate. However, they studied different, non-social biases that arise from spurious or unintended correlations in training datasets (often called dataset biases).…”
Section: Discussion (contrasting)
confidence: 99%
See 1 more Smart Citation
“…Our findings suggest that practitioners of NLP should take special care when adopting previously debiased models and inspect them carefully, perhaps using our framework. Our results differ from those of Mendelson and Belinkov (2021a), who found that the debiasing increases bias extractability as measured by compression rate. However, they studied different, non-social biases, that arise from spurious or unintended correlations in training datasets (often called dataset biases).…”
Section: Discussioncontrasting
confidence: 99%
“…We use the MDL probe (Voita and Titov, 2020) implementation by Mendelson and Belinkov (2021b). In all experiments, we use a linear probe and train it with a batch size of 16 and a learning rate of 1e-3.…”
Section: A3 Probing Classifier (mentioning)
confidence: 99%
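The probe configuration quoted above (a linear probe, batch size 16, learning rate 1e-3) can be sketched as plain mini-batch SGD over frozen representations. This is a minimal illustration only: the actual MDL probe of Voita and Titov (2020) additionally measures codelength via online coding, which is not reproduced here, and the function names and synthetic data below are hypothetical.

```python
import numpy as np

def train_linear_probe(reps, labels, n_classes, epochs=20, batch_size=16, lr=1e-3, seed=0):
    """Train a linear softmax probe on frozen representations with mini-batch SGD.

    reps:   (n, d) array of frozen model representations
    labels: (n,) integer property labels to probe for
    """
    rng = np.random.default_rng(seed)
    n, d = reps.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    for _ in range(epochs):
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            x, y = reps[idx], labels[idx]
            # Softmax cross-entropy gradient: probs - one_hot(y)
            logits = x @ W + b
            logits -= logits.max(axis=1, keepdims=True)
            probs = np.exp(logits)
            probs /= probs.sum(axis=1, keepdims=True)
            grad = probs
            grad[np.arange(len(idx)), y] -= 1.0
            grad /= len(idx)
            W -= lr * (x.T @ grad)
            b -= lr * grad.sum(axis=0)
    return W, b

def probe_accuracy(W, b, reps, labels):
    """Fraction of examples whose probed label is predicted correctly."""
    return float(((reps @ W + b).argmax(axis=1) == labels).mean())
```

In the probing setting, higher accuracy (or shorter MDL codelength) on held-out representations is read as the probed property being more extractable from the model's representations.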
“…Our work is similar, but examines fundamental questions about what models will learn from debiasing procedures. Mendelson and Belinkov (2021) show through a probing experiment that debiasing against a particular bias may increase the extent to which that bias is encoded in the inner representations of models. In this work, we study how debiasing procedures affect model behavior, as probe performance is not necessarily indicative of the information which a model actually uses to make predictive decisions (Ravichander et al., 2021; Elazar et al., 2021).…”
Section: Related Work and Background (mentioning)
confidence: 98%
“…For instance, a bias model may be trained adversarially, making the main model perform worse when the bias model performs well (Belinkov et al., 2019b; Stacey et al., 2020). Others use a bias model to modulate the main model's predictions in various ways (He et al., 2019; Karimi Mahabadi et al., 2020; Utama et al., 2020b; Sanh et al., 2021; Mendelson and Belinkov, 2021). All these approaches use discriminative models to estimate p(y | P, H).…”
Section: Mitigation Strategies (mentioning)
confidence: 99%
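One common way a bias model "modulates" the main model's predictions, as in the product-of-experts family of methods (e.g., He et al., 2019; Karimi Mahabadi et al., 2020), is to add the frozen bias model's log-probabilities to the main model's logits before applying the training loss. The sketch below is illustrative only (function names are hypothetical, and it omits the neural models and training loop); it shows just the combination step, under which the main model receives little gradient on examples the bias model already explains.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def product_of_experts_logits(main_logits, bias_log_probs):
    """Combine main-model logits with a frozen bias model's log-probabilities.

    The training loss is applied to these combined logits, so the main model
    is pushed to account for whatever the bias model cannot predict.
    """
    return main_logits + bias_log_probs
```

For intuition: if the main model is still uninformative (all-zero logits), the combined distribution simply reproduces the bias model's prediction, so a bias-aligned example contributes almost no loss and the main model is not rewarded for learning the biased shortcut.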