2017
DOI: 10.48550/arxiv.1712.00733
Preprint

Incorporating External Knowledge to Answer Open-Domain Visual Questions with Dynamic Memory Networks

Abstract: Visual Question Answering (VQA) has attracted much attention since it offers insight into the relationships between the multi-modal analysis of images and natural language. Most of the current algorithms are incapable of answering open-domain questions that require to perform reasoning beyond the image contents. To address this issue, we propose a novel framework which endows the model capabilities in answering more complex questions by leveraging massive external knowledge with dynamic memory networks. Specif…

Cited by 20 publications (22 citation statements)
References 30 publications
“…As consistent with the results on FVQA, we achieve a significant improvement (8.13% on top-1 accuracy and 16.51% on top-3 accuracy) over state-of-the-art models. Note that our proposed GRUC network is a single model, which outperforms the existing ensembled model [21]. We believe that the performance can be further improved if the technique of ensemble is involved in our model.…”
Section: Experimental Results On Visual7w-kb (mentioning)
confidence: 84%
“…However, the visual information is wholly provided which may introduce redundant information for reasoning the answer. The same problem also exists in [21], although they leveraged dynamic memory network instead of graph convolutional network to incorporate the external knowledge. Recent work [22] proposed a new knowledge-based task OK-VQA and introduced a retrieval-based model to extract the correct answer from Wikipedia.…”
Section: Incorporating External Knowledge In Vqa (mentioning)
confidence: 99%
“…External knowledge has gained great interest in natural language processing [3,17] and computer vision [1,11,34]. As the information extracted from training sets are always insufficient to fully recover the real knowledge domain, previous works explicitly incorporate external knowledge to compensate it.…”
Section: External Knowledge Distillation (mentioning)
confidence: 99%
“…Other methods focused on integrating external prior knowledge, mostly by producing a query to a knowledge database using the question and the image [38]. Extracted external knowledge was also fused with question and image representations [41,26].…”
Section: Related Work (mentioning)
confidence: 99%