2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr42600.2020.01081

Counterfactual Samples Synthesizing for Robust Visual Question Answering

Abstract: Today's VQA models still tend to capture superficial linguistic correlations in the training set and fail to generalize to the test set with different QA distributions. To reduce these language biases, recent VQA works introduce an auxiliary question-only model to regularize the training of targeted VQA model, and achieve dominating performance on diagnostic benchmarks for out-of-distribution testing. However, due to complex model design, these ensemble-based methods are unable to equip themselves with two ind…
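The abstract describes the ensemble-based debiasing setup these methods share: a question-only branch gates the main VQA model's predictions during training so that answers predictable from the question alone are down-weighted. Below is a minimal sketch of one common fusion scheme of this kind (RUBi-style sigmoid masking) — not this paper's exact formulation, and the logit values are made up for illustration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fuse_logits(vqa_logits, q_only_logits):
    """Mask the VQA model's logits with a sigmoid gate from the
    question-only branch. Answers the question-only model scores
    highly keep their logits; the rest are suppressed, so the loss
    pushes the VQA model to rely on the image for those answers.
    The mask is used only during training, not at test time."""
    return vqa_logits * sigmoid(q_only_logits)

# Toy example with 4 candidate answers; answer 0 has a strong
# language prior (the question alone predicts it).
vqa = np.array([2.0, 1.0, 0.5, -1.0])
q_only = np.array([5.0, -5.0, 0.0, 0.0])

fused = fuse_logits(vqa, q_only)
```

Answer 1, which the question-only branch rates very unlikely, has its logit gated almost to zero, so gradients on that answer must come from the image side.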

Cited by 257 publications (163 citation statements)
References 46 publications
“…Counterfactual sample. Constructing counterfactual samples has become an emerging data augmentation technique in natural language processing, used across a wide spectrum of language understanding tasks, including SA (Kaushik et al., 2019), NLI (Kaushik et al., 2019), named entity recognition (Zeng et al., 2020), question answering (Chen et al., 2020), dialogue systems, and vision-language navigation (Fu et al., 2020). Beyond data augmentation under the standard supervised learning paradigm, a line of research explores incorporating counterfactual samples into other learning paradigms such as adversarial training (Fu et al., 2020; Teney et al., 2020) and contrastive learning (Liang et al., 2020).…”
Section: Related Work
“…Given the labeled factual sample, counterfactual samples can be constructed either manually (Kaushik et al., 2019) or automatically (Chen et al., 2020) by making minimal changes to x that swap its label from y to c…”
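The excerpt above defines counterfactual construction as a minimal edit to x that flips its label. A toy illustration for sentiment analysis, where the antonym substitution table is hypothetical and just for demonstration:

```python
# Hypothetical substitution table: swap sentiment-bearing words for
# antonyms, flipping the label while leaving all other tokens intact.
FLIP = {"great": "terrible", "terrible": "great",
        "good": "bad", "bad": "good"}

def counterfactual(x: str) -> str:
    """Return a minimally edited copy of x: only tokens found in
    FLIP are replaced; everything else is unchanged."""
    return " ".join(FLIP.get(tok, tok) for tok in x.split())

factual = "a great movie with a good plot"          # label: positive
print(counterfactual(factual))  # -> "a terrible movie with a bad plot"
```

Real systems (manual or automatic) use far richer edit operations, but the principle is the same: the smallest change that moves the example across the decision boundary.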
“…Apart from proposing newer architectures to tackle the VQA problem, training techniques [18,19,20,21] have been put forward that may help increase accuracy. Special care is taken with the dataset during training, accounting for semantic changes in the input data that might affect the output.…”
Section: Related Work
“…Some other works (such as DCN [39], BAN [40], and MCAN [41]) investigate "dense" co-attention that uses bidirectional attention between images and questions. More recent works try to capture more complex visual-textual information [42]-[45]. Our work instead tries to keep our approach as simple as possible by using three independently trained models to obtain the entropy.…”
Section: Related Work