Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
DOI: 10.18653/v1/2020.emnlp-main.63

MUTANT: A Training Paradigm for Out-of-Distribution Generalization in Visual Question Answering

Abstract: While progress has been made on the visual question answering leaderboards, models often utilize spurious correlations and priors in datasets under the i.i.d. setting. As such, evaluation on out-of-distribution (OOD) test samples has emerged as a proxy for generalization. In this paper, we present MUTANT, a training paradigm that exposes the model to perceptually similar, yet semantically distinct mutations of the input, to improve OOD generalization, such as the VQA-CP challenge. Under this paradigm, models u…
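The abstract's core idea, training on semantically distinct mutations of each input, can be viewed as a structured data-augmentation objective. Below is a minimal sketch, assuming a generic `model(image, question)` callable and a dataset that pairs each sample with a pre-computed mutation (e.g., a negated question with a correspondingly changed answer). The names and the simple summed loss are illustrative assumptions; the paper's full method also includes components (such as answer-embedding and pairwise-consistency losses) not shown here.

```python
import torch
import torch.nn.functional as F

def mutant_style_step(model, batch):
    """One training step on an original VQA sample plus its mutation.

    `batch` is assumed (hypothetically) to hold an original
    (image, question, answer) triple and a perceptually similar but
    semantically distinct mutant: a mutated question with its changed answer.
    """
    img, q, a = batch["image"], batch["question"], batch["answer"]
    q_mut, a_mut = batch["mutated_question"], batch["mutated_answer"]

    # Supervise the model on both the original and the mutant sample,
    # so shortcut features that ignore the mutated detail are penalized.
    loss_orig = F.cross_entropy(model(img, q), a)
    loss_mut = F.cross_entropy(model(img, q_mut), a_mut)
    return loss_orig + loss_mut
```

Because the paired samples differ only in the mutated detail, gradients from the pair push the model toward features that track that detail rather than dataset priors.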

Cited by 84 publications (72 citation statements)
References 43 publications

Citation statements:

“…Training and testing under the independent and identically distributed (i.i.d.) setting have resulted in the performance of most VQA models being highly affected by superficial correlations (i.e., language biases and dataset biases) [1,2,20,74]. Recently, evaluation on the out-of-distribution (OOD) setting [18,24,35,60] has thus become an increasing concern for VQA. To improve the OOD generalization performance of VQA models, the prevailing methods target eliminating the language bias.…”
Section: Related Work 2.1 OOD Generalization in VQA (mentioning)
confidence: 99%
“…To improve the OOD generalization performance of VQA models, the prevailing methods target eliminating the language bias. Accordingly, current debiasing methods for VQA can be broadly divided into two groups, Known Bias-based [7,10,45] and Unknown Bias-based [11,18,58].…”
Section: Related Work 2.1 OOD Generalization in VQA (mentioning)
confidence: 99%
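The "Known Bias" family referenced in the excerpt above can be made concrete with a short sketch. The following is a minimal illustration in the spirit of RUBi-style question-only debiasing, not any cited paper's exact formulation; the function names and the sigmoid fusion are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def known_bias_debias_loss(vqa_logits, q_only_logits, answer):
    """Sketch of question-only debiasing: a branch that sees only the
    question captures the language prior, and its confidence masks the
    main model's logits so that answers predictable from the question
    alone earn the main model little reward during training."""
    fused = vqa_logits * torch.sigmoid(q_only_logits)
    loss_fused = F.cross_entropy(fused, answer)
    # The question-only branch is trained separately to model the bias.
    loss_q = F.cross_entropy(q_only_logits, answer)
    return loss_fused + loss_q
```

At inference time only `vqa_logits` would be used, so the bias branch acts purely as a training-time regularizer.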
“…Most of them are designed for the language priors problem, while LXMERT represents the recent trend towards utilizing BERT-like pre-trained models (Li et al., 2019; Chen et al., 2020b; Li et al., 2020) which have top performances on various downstream vision and language tasks (including VQA-v2). Note that MUTANT (Gokhale et al., 2020) uses the extra object-name label to ground the textual concepts in the image. For fair comparison, we do not compare with MUTANT.…”
Section: Inference Process (mentioning)
confidence: 99%
“…Apart from coming up with newer architectures to tackle the VQA problem, training techniques [18,19,20,21] have been put forward which might help to increase accuracy. Special care is taken during training to account for semantic changes in the input data that might affect the output.…”
Section: Related Work (mentioning)
confidence: 99%