2012 IEEE Conference on Computer Vision and Pattern Recognition 2012
DOI: 10.1109/cvpr.2012.6247806

Action bank: A high-level representation of activity in video

Abstract: Activity recognition in video is dominated by low- and mid-level features, and while demonstrably capable, by nature, these features carry little semantic meaning. Inspired by the recent object bank approach to image representation, we present Action Bank, a new high-level representation of video. Action bank is comprised of many individual action detectors sampled broadly in semantic space as well as viewpoint space. Our representation is constructed to be semantically rich and even when paired with simple lin…

Cited by 612 publications (480 citation statements)
References 36 publications
“…Since Sadanand and Corso [5] use richer representation for videos, they report a recognition accuracy of 98% on KTH actions dataset. However, there are three main advantages of our method compared to [5].…”
Section: Results (mentioning)
confidence: 99%
“…However, there are three main advantages of our method compared to [5]. First, our method automatically reports the high-level features (camera position features) and provides a semantic description for videos.…”
Section: Results (mentioning)
confidence: 99%
“…However, they are not directly associated with persons, zones in the scene, a person's activities, and what happens during a person's trajectory. Such associations are not trivial: many of the well-performing methods consider the low-level features in the whole video [3], the whole scene [4], or in sub-volumes without making explicit associations [5]. Recent attempts for complex behaviours in complex scenes have not been successful yet [6], although reasonable performance have been reported for simple activities [7].…”
Section: Introduction (mentioning)
confidence: 99%