2019 First International Conference on Graph Computing (GC)
DOI: 10.1109/gc46384.2019.00015
Visual Question Answering over Scene Graph

Cited by 21 publications (11 citation statements)
References 12 publications
“…When the validation set is limited to those questions that have a consistent functional form, the validation accuracy becomes 93%, which is a better indicator of the model's performance on the GQA dataset. This is comparable to the current state of the art for the GQA dataset under a perfect-sight configuration [9] (96.3%). Aside from the fact that the proposed solution is also highly interpretable and its reasoning steps are easy to follow, the transformer is much easier to train than the Graph Neural Network used in [9].…”
Section: Models (supporting)
confidence: 79%
“…This is comparable to the current state of the art for the GQA dataset under a perfect-sight configuration [9] (96.3%). Aside from the fact that the proposed solution is also highly interpretable and its reasoning steps are easy to follow, the transformer is much easier to train than the Graph Neural Network used in [9]. Table IV summarizes the model performance on the GQA dataset.…”
Section: Models (supporting)
confidence: 79%
“…Scene graph generation (SGG) [13], a visual task that detects objects and recognizes the semantic relationships between them in an image, can serve as a powerful structural representation of images and benefit other high-level Vision-and-Language tasks such as image generation [12,33,40], image retrieval [13,24,28,34], visual question answering [6,18,39] and image captioning [7,19,38]. Taking advantage of the remarkable feature representations of convolutional neural networks (CNNs) [16] and diverse contextual feature fusion strategies (e.g., message passing [20,37], LSTMs [41]), a variety of methods have made significant progress in improving the recall metric on SGG tasks.…”
Section: Introduction (mentioning)
confidence: 99%
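
To make the quoted description of scene graph generation concrete, below is a minimal sketch of the kind of structure an SGG model produces and a question-answering model could reason over: object nodes plus (subject, predicate, object) relationship edges. The class and method names (SceneGraph, add_object, add_relation, triples) are illustrative assumptions only; they are not the representation used in this paper or in any of the cited works.

# Minimal, hypothetical scene graph container: object labels plus
# (subject_index, predicate, object_index) relationship triples.
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SceneGraph:
    """Objects detected in an image and the relationships between them."""
    objects: List[str] = field(default_factory=list)
    relations: List[Tuple[int, str, int]] = field(default_factory=list)

    def add_object(self, label: str) -> int:
        # Append an object node and return its index.
        self.objects.append(label)
        return len(self.objects) - 1

    def add_relation(self, subj: int, predicate: str, obj: int) -> None:
        # Record a directed edge: subject --predicate--> object.
        self.relations.append((subj, predicate, obj))

    def triples(self) -> List[Tuple[str, str, str]]:
        # Return human-readable (subject, predicate, object) triples.
        return [(self.objects[s], p, self.objects[o]) for s, p, o in self.relations]


# Usage: a two-object graph of the sort a VQA model might query.
g = SceneGraph()
man = g.add_object("man")
horse = g.add_object("horse")
g.add_relation(man, "riding", horse)
print(g.triples())  # [('man', 'riding', 'horse')]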