2022
DOI: 10.3390/math10183244

Path-Wise Attention Memory Network for Visual Question Answering

Abstract: Visual question answering (VQA) is regarded as a multi-modal fine-grained feature fusion task, which requires the construction of multi-level and omnidirectional relations between nodes. One main solution is the composite attention model, which is composed of co-attention (CA) and self-attention (SA). However, existing composite models only consider stacks of single attention blocks and lack path-wise historical memory and overall adjustment. We propose a path attention memory network (PAM) to construct…
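The composite attention design the abstract refers to (stacks of self-attention and co-attention blocks over image and question features) can be illustrated with a minimal NumPy sketch. This is a generic SA + CA block in the style of such models, not the authors' PAM; the function names, feature dimensions, and the three-block stack are all illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(queries, keys, values):
    # Scaled dot-product attention: each query output is a weighted
    # sum of values, weighted by query-key relevance.
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)   # (n_q, n_k) relevance scores
    return softmax(scores) @ values

def composite_block(img_feats, txt_feats):
    # One composite block: self-attention (SA) within each modality,
    # then co-attention (CA) in which the question guides the image.
    txt = attention(txt_feats, txt_feats, txt_feats)  # SA over words
    img = attention(img_feats, img_feats, img_feats)  # SA over regions
    img = attention(img, txt, txt)                    # CA: image attends to text
    return img, txt

# Toy inputs: 36 image regions and 14 question tokens, 512-d each.
rng = np.random.default_rng(0)
img = rng.normal(size=(36, 512))
txt = rng.normal(size=(14, 512))
for _ in range(3):            # a plain stack of single attention blocks
    img, txt = composite_block(img, txt)
print(img.shape, txt.shape)   # (36, 512) (14, 512)
```

Real composite models add learned projections, multiple heads, residual connections, and layer normalization; the sketch keeps only the block composition, which is exactly the "stack of single attention blocks" structure the abstract argues lacks path-wise memory.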

Cited by 1 publication (1 citation statement) · References 63 publications (110 reference statements)
“…That is, the observation can be adjusted toward the more informative features according to their relative importance, focusing the algorithm on the most relevant parts of the input and moving from global features to focused features, thus saving resources and extracting the most effective information quickly. The attention mechanism has arguably become one of the most important concepts in deep learning. Since Bahdanau, Cho & Bengio (2015) used the attention mechanism for machine translation, various variants have emerged, such as Co-Attention networks (Yang et al., 2019a; Han et al., 2021; Yu et al., 2019; Liu et al., 2021b; Lu et al., 2016; Sharma & Srivastava, 2022), Recurrent Attention networks (Osman & Samek, 2019; Ren & Zemel, 2017; Gan et al., 2019), and Self-Attention networks (Li et al., 2019; Fan et al., 2019; Ramachandran et al., 2019; Xia et al., 2022; Xiang et al., 2022; Yan, Silamu & Li, 2022). All of these attention mechanisms considerably enhance the effectiveness of visual information processing and in turn improve VQA performance.…”
Section: Introduction (mentioning)
confidence: 99%
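The re-weighting behavior this excerpt describes (shifting from global features to the most relevant parts according to relative importance) is the additive attention of Bahdanau, Cho & Bengio (2015), which it cites. A minimal sketch follows; the parameter names and toy dimensions are illustrative assumptions, not taken from any cited implementation.

```python
import numpy as np

def additive_attention(query, keys, W_q, W_k, v):
    # Additive (Bahdanau-style) attention: score each key against the
    # query with a one-layer feed-forward net, softmax the scores into
    # relative-importance weights, and return the focused context.
    scores = np.tanh(query @ W_q + keys @ W_k) @ v  # (n_keys,)
    w = np.exp(scores - scores.max())
    w /= w.sum()                                    # weights sum to 1
    return w @ keys, w

# Toy setup: one 8-d query attends over five 8-d input features.
rng = np.random.default_rng(1)
d = 8
query = rng.normal(size=d)
keys = rng.normal(size=(5, d))
W_q, W_k, v = (rng.normal(size=(d, d)), rng.normal(size=(d, d)),
               rng.normal(size=d))
context, w = additive_attention(query, keys, W_q, W_k, v)
print(np.round(w, 3))  # most weight lands on the most relevant inputs
```

The printed weights make the "relative importance" point concrete: the model does not discard any input, but the softmax concentrates mass on the few entries whose scores are highest.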