2019
DOI: 10.48550/arxiv.1904.03282
Preprint

Weakly Supervised Video Moment Retrieval From Text Queries

citations
Cited by 1 publication
(1 citation statement)
references
References 0 publications
“…The model inputs both image and video data and a test description of the events taking place. Three application tasks are handled (Action Understanding [11], Video-Language Alignment [12] and Video Open Understanding [5]) and used a set of corresponding datasets to work with each of them (Kinetics [13], ActivityNet [14], MSR-VTT [15], DiDeMo [16], UCF101-HMDB51 [17], Ego4d [18], etc.).…”
Section: Fig 7 Bar Chart Of All Three Models Performance
Citation type: mentioning
confidence: 99%