2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2016.290
Predicting the Where and What of Actors and Actions through Online Action Localization

Cited by 74 publications (89 citation statements)
References 33 publications
“…Li et al.'s work [27] exploits sequence mining, where a series of actions and object co-occurrences are encoded as symbolic sequences. Soomro et al. [43] propose to use binary SVMs to localize and classify video snippets into sub-action categories, and obtain the final class label in an online manner using dynamic programming. In [50], action prediction is approached using still images with action-scene correlations.…”
Section: Related Work
confidence: 99%
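The sub-action labeling described above (binary SVM scores per snippet, combined by dynamic programming) can be sketched as a Viterbi-style DP. This is an illustrative assumption about the formulation, not the exact method of Soomro et al. [43]: snippets are assigned to ordered sub-actions, and the assignment may only stay in the current sub-action or advance to the next one.

```python
# Hypothetical sketch: monotone DP over ordered sub-actions.
# scores[t][k] is the (e.g. SVM) score of snippet t under sub-action k.

def best_subaction_path(scores):
    """Return (best_total_score, labels), labels non-decreasing in k."""
    T, K = len(scores), len(scores[0])
    NEG = float("-inf")
    dp = [[NEG] * K for _ in range(T)]
    back = [[0] * K for _ in range(T)]
    dp[0][0] = scores[0][0]  # assume the action starts with its first sub-action
    for t in range(1, T):
        for k in range(K):
            stay = dp[t - 1][k]                     # remain in sub-action k
            advance = dp[t - 1][k - 1] if k > 0 else NEG  # move on from k-1
            if stay >= advance:
                dp[t][k], back[t][k] = stay, k
            else:
                dp[t][k], back[t][k] = advance, k - 1
            dp[t][k] += scores[t][k]
    # best final sub-action, then backtrack the label sequence
    k = max(range(K), key=lambda j: dp[T - 1][j])
    labels = [0] * T
    labels[T - 1] = k
    for t in range(T - 1, 0, -1):
        k = back[t][k]
        labels[t - 1] = k
    return dp[T - 1][labels[-1]], labels
```

Because `dp[t][·]` depends only on `dp[t-1][·]`, the recursion can be updated causally as snippets arrive, which is what makes an online labeling of this kind feasible.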
“…The idea of anticipation was introduced in the computer vision community almost a decade ago by [35]. While the early methods [34,40,39] relied on handcrafted features, they have now been superseded by end-to-end learning methods [21,12,1], focusing on designing new losses better suited to anticipation. In particular, the loss of [1] has proven highly effective, achieving state-of-the-art results on several standard benchmarks.…”
Section: Baseline Methods
confidence: 99%
“…Metrics. Following [34,38,48], we utilize video mean Average Precision (mAP) to evaluate action detection accuracy. We calculate an average of per-frame Intersection-over-Union (IoU) across time between tubes.…”
Section: Datasets, Metrics and Implementation
confidence: 99%
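The metric quoted above, per-frame IoU averaged across time between action tubes, can be sketched as follows. Names and the exact handling of non-overlapping frames are illustrative assumptions; a common convention is that frames covered by only one tube contribute zero, so temporal misalignment is penalized together with spatial misalignment.

```python
# Sketch of spatio-temporal tube IoU: per-frame box IoU averaged over the
# temporal union of two tubes. A tube is a dict mapping frame index -> box.

def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def tube_iou(tube_a, tube_b):
    """Average per-frame IoU over the temporal union of the two tubes.

    Frames present in only one tube contribute 0 (assumed convention).
    """
    frames = set(tube_a) | set(tube_b)
    if not frames:
        return 0.0
    total = sum(
        box_iou(tube_a[f], tube_b[f]) if f in tube_a and f in tube_b else 0.0
        for f in frames
    )
    return total / len(frames)
```

In a video-mAP evaluation, a detected tube would then count as a true positive at threshold τ (e.g. 0.5) when its class matches the ground truth and `tube_iou` meets the threshold, with AP averaged over classes.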