Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics 2020
DOI: 10.18653/v1/2020.acl-main.770
Mind the Trade-off: Debiasing NLU Models without Degrading the In-distribution Performance

Abstract: Models for natural language understanding (NLU) tasks often rely on the idiosyncratic biases of the dataset, which make them brittle against test cases outside the training distribution. Recently, several debiasing methods have been shown to be very effective at improving out-of-distribution performance. However, their improvements come at the expense of a performance drop when models are evaluated on the in-distribution data, which contain examples with higher diversity. This seemingly inevitable trade-off m…

Cited by 51 publications (28 citation statements).
References 24 publications (49 reference statements).
“…We note that there are three different criteria for controlling the debiasing strategy: (1) Models may be trained end-to-end by propagating errors to the weak learner as well as the main model (Mahabadi et al., 2020) or in a pipeline, where the weak learner is trained first and frozen, such that only its predictions are used to tune the combination loss (He et al., 2019; Clark et al., 2019; Sanh et al., 2021; Utama et al., 2020a). (2…”
Section: Debiasing Methods
confidence: 99%
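The pipeline variant described in the statement above, where a frozen weak learner's predictions modulate the main model's loss, is often realized as a product of experts. The following is a minimal PyTorch sketch, assuming the bias model's logits are precomputed; the function name and toy batch are illustrative, not taken from the cited papers' code:

```python
import torch
import torch.nn.functional as F

def product_of_experts_loss(main_logits, bias_logits, labels):
    """Pipeline-style debiasing sketch: combine the main model's logits
    with the frozen bias model's log-probabilities, then apply
    cross-entropy to the combined (product-of-experts) distribution."""
    bias_log_probs = F.log_softmax(bias_logits, dim=-1).detach()  # frozen weak learner
    combined = F.log_softmax(main_logits, dim=-1) + bias_log_probs
    # cross_entropy re-normalizes, so this is -log softmax(combined)[gold]
    return F.cross_entropy(combined, labels)

# Toy batch: 4 examples, 3 NLI labels
torch.manual_seed(0)
main_logits = torch.randn(4, 3, requires_grad=True)
bias_logits = torch.randn(4, 3)
labels = torch.tensor([0, 1, 2, 0])
loss = product_of_experts_loss(main_logits, bias_logits, labels)
loss.backward()
```

Because the bias term is detached, gradients reach only the main model, matching the pipeline setup in which the weak learner is trained first and then frozen.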
“…contradiction H: The woman is not awake. A recently popular solution to this issue is to develop debiasing methods that overcome these biases at the training stage (Belinkov et al. 2019; He, Zha, and Wang 2019; Stacey et al. 2020; Mahabadi, Belinkov, and Henderson 2020; Utama, Moosavi, and Gurevych 2020; Ghaddar et al. 2021). Namely, they first use a bias model to identify biased samples.…”
Section: Premise and Hypothesis
confidence: 99%
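The "bias model identifies biased samples" step mentioned above can be sketched as a simple confidence filter: an example is flagged when a weak bias-only model (e.g., one that sees just the hypothesis) already predicts the gold label with high confidence. The threshold and function name here are assumptions for illustration:

```python
import numpy as np

def flag_biased_examples(bias_probs, labels, threshold=0.8):
    """Flag examples whose gold-label probability under the bias-only
    model meets the confidence threshold (illustrative heuristic)."""
    bias_probs = np.asarray(bias_probs)
    gold_conf = bias_probs[np.arange(len(labels)), labels]
    return gold_conf >= threshold

probs = [[0.9, 0.05, 0.05], [0.2, 0.5, 0.3], [0.1, 0.1, 0.8]]
labels = [0, 1, 2]
flags = flag_biased_examples(probs, labels)
```

Here the first and third examples are flagged (gold confidence 0.9 and 0.8), while the second is kept as unbiased.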
“…In the second approach, they first recognize examples that contain artifacts, and use this knowledge in the training objective to either skip or downweight biased examples (He et al., 2019; Clark et al., 2019a), or to regularize the confidence of the model on those examples (Utama et al., 2020a). The use of this information in the training objective improves the robustness of the model on adversarial datasets (He et al., 2019; Clark et al., 2019a; Utama et al., 2020a), i.e., datasets that contain counterexamples in which relying on the bias results in an incorrect prediction. In addition, it can also improve in-domain performance as well as generalization across various datasets that represent the same task (Wu et al., 2020a; Utama et al., 2020b).…”
Section: Artifacts in NLP Datasets
confidence: 99%
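One common form of the down-weighting objective described above scales each example's loss by how poorly the bias model already handles it. This sketch assumes a (1 − bias confidence) weighting, as in example-reweighting variants; the exact weighting scheme differs across the cited papers:

```python
import torch
import torch.nn.functional as F

def reweighted_loss(main_logits, bias_probs, labels):
    """Down-weighting sketch: scale each example's cross-entropy by
    (1 - p_bias(gold)), so examples the bias model already solves
    contribute less to the gradient."""
    per_example = F.cross_entropy(main_logits, labels, reduction="none")
    gold_bias_conf = bias_probs.gather(1, labels.unsqueeze(1)).squeeze(1)
    weights = (1.0 - gold_bias_conf).detach()
    return (weights * per_example).mean()

# Toy case: the bias model is fully confident and correct on every
# example, so all weights are zero and the batch contributes no loss.
torch.manual_seed(0)
logits = torch.randn(4, 3)
bias_probs = torch.tensor([[1.0, 0.0, 0.0]] * 4)
labels = torch.tensor([0, 0, 0, 0])
loss = reweighted_loss(logits, bias_probs, labels)
```

Skipping biased examples entirely corresponds to the limiting case of a hard zero/one weight instead of this soft weighting.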