2020
DOI: 10.1016/j.ipm.2020.102265

Video question answering via grounded cross-attention network learning

citations
Cited by 16 publications
(12 citation statements)
references
References 10 publications
“…YouTube2Text-QA result. In Table 5, we compare our methods with the state-of-the-art r-ANL [30] on YouTube2Text-QA dataset. It's worth mentioning that r-ANL utilized frame-level attributes as additional supervision to augment learning while our method does not.…”
Section: Methods Question Type
Citation type: mentioning
confidence: 99%
“…0.262). We also report the per-class accuracy to make direct comparison with [30], and our method is better than r-ANL in this evaluation method.…”
Section: Methods Question Type
Citation type: mentioning
confidence: 99%