2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition 2018
DOI: 10.1109/cvpr.2018.00157

One-Shot Action Localization by Learning Sequence Matching Network

Citations: cited by 50 publications (80 citation statements)
References: 26 publications
“…Video Re-localization [13] aims to find segments in reference videos semantically corresponding to a given query video. A more specialized task, one-shot action localization [52], focuses on the temporal detection of actions in videos giving an example. The STVR task to be solved in this paper is an extension of temporal video re-localization.…”
Section: Related Work (mentioning)
confidence: 99%
“…The action localization problem has been well-studied in the computer vision literature [13], [14]. The most closely related work is that of [15], in which they propose a similar few-shot action localization problem and solve it through a meta-learning framework. However, their approach requires a specialized network architecture, a full context embedding network, whereas our approach is fully general, allowing the flexibility of choosing any network architecture.…”
Section: Related Work (mentioning)
confidence: 99%
“…As the strength of CNN processing has improved, a recent trend in computer vision is to use a CNN to extract the features of an input image. The extracted features can be used for further location inference [37], [38]. However, a limitation of existing CNN-based visual localization methods is that they do not consider the context of the scene.…”
Section: Related Work (mentioning)
confidence: 99%