“…Recent studies indicate that pre-trained language models like BERT tend to exploit biases in the dataset for prediction rather than acquiring higher-level semantic understanding and reasoning (Niven and Kao, 2019; Du et al., 2021; McCoy et al., 2019a). Preliminary work on mitigating the bias of general pre-trained models includes product-of-experts (He et al., 2019; Sanh et al., 2021), reweighting (Schuster et al., 2019; Yaghoobzadeh et al., 2019; Utama et al., 2020), adversarial training (Stacey et al., 2020), and posterior regularization (Cheng et al., 2021). Recently, challenging benchmark datasets, e.g., CheckList (Ribeiro et al., 2020) and Robustness Gym (Goel et al., 2021), have been developed to facilitate the evaluation of the robustness of these models.…”
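To make the product-of-experts idea concrete, the following is a minimal PyTorch sketch, not the exact formulation of any cited paper: a frozen bias-only model and the main model are combined in log space during training, so the main model is penalized only for errors the bias model cannot already explain. The function name, tensor shapes, and toy data below are illustrative assumptions.

import torch
import torch.nn.functional as F

def product_of_experts_loss(main_logits, bias_logits, labels):
    """Product-of-experts debiasing loss (in the spirit of He et al., 2019).

    Combines the main model and a frozen bias-only model multiplicatively,
    i.e. log p_ensemble(y|x) = log p_main(y|x) + log p_bias(y|x) + const,
    and trains with cross-entropy on the renormalized ensemble. At test
    time only the main model is used.
    """
    # Sum of log-probabilities = product of probabilities (up to normalization).
    # detach() keeps gradients from flowing into the bias-only model.
    ensemble_log_probs = (F.log_softmax(main_logits, dim=-1)
                          + F.log_softmax(bias_logits.detach(), dim=-1))
    # Renormalize the combined scores and apply negative log-likelihood.
    return F.nll_loss(F.log_softmax(ensemble_log_probs, dim=-1), labels)

# Toy usage: batch of 4 examples, 3-way classification (e.g., NLI labels).
main_logits = torch.randn(4, 3, requires_grad=True)   # from the main model
bias_logits = torch.randn(4, 3)                       # from a frozen bias-only model
labels = torch.tensor([0, 2, 1, 0])
loss = product_of_experts_loss(main_logits, bias_logits, labels)
loss.backward()  # gradients reach the main model only

The key design choice is the detach on the bias logits: where the bias-only model is already confident in the gold label, the ensemble loss is small and the main model receives little gradient, which discourages it from relearning the same dataset shortcut.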