CVPR 2011
DOI: 10.1109/cvpr.2011.5995586
A large-scale benchmark dataset for event recognition in surveillance video

Abstract: We introduce a new large-scale video dataset designed to assess the performance of diverse visual event recognition algorithms, with a focus on continuous visual event recognition (CVER) in outdoor areas with wide coverage. Previous datasets for action recognition are unrealistic for real-world surveillance because they consist of short clips showing one action by one individual [15,8]. Datasets have been developed for movies [11] and sports [12], but these actions and scene conditions do not apply effectively…

Cited by 583 publications (396 citation statements); references 18 publications.
“…Popular datasets for this task include the Pascal dataset (4), the LabelMe dataset (24), and the Lotus Hill dataset (25), all populated by relatively unconstrained natural images, but varying considerably in size and in the level of annotation, ranging from a few keywords to hierarchical representations (Lotus Hill). Finally, a few other datasets have been assembled and annotated to evaluate the quality of detected object attributes such as color, orientation, and activity; examples are the Core dataset (26), with annotated object parts and attributes, and the Virat dataset (27) for event detection in videos.…”
Section: Current Evaluation Practice
confidence: 99%
“…Four surveillance videos with a resolution of 1920 × 1080 and a length of one minute were used. They were taken from the VIRAT database, which was designed for performance assessment of activity detection algorithms [6]. Representative frames from the four scenarios are shown in Figure 3.…”
Section: Results
confidence: 99%
“…The baseline method used for comparison is very similar to the one used by the authors of [9]. This method is the usual BoW pipeline [10], where the STIPs are detected with 3-D Harris corners and HOG/HOF is used as the descriptor.…”
Section: B. Protocol
confidence: 99%
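The bag-of-words (BoW) baseline mentioned in the last excerpt quantizes local spatio-temporal descriptors (e.g. HOG/HOF vectors at detected STIPs) against a learned vocabulary and represents each clip as a word-count histogram. A minimal sketch of that quantization step, using plain k-means in NumPy (function names and parameters here are illustrative, not from the cited papers):

```python
import numpy as np

def build_vocabulary(descriptors, k, iters=20, seed=0):
    # Plain k-means over stacked local descriptors (rows = descriptors,
    # e.g. HOG/HOF vectors pooled from many training clips).
    # Returns k cluster centers, i.e. the "visual words".
    rng = np.random.default_rng(seed)
    centers = descriptors[rng.choice(len(descriptors), k, replace=False)].astype(float)
    for _ in range(iters):
        # assign each descriptor to its nearest center (Euclidean distance)
        dists = np.linalg.norm(descriptors[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = descriptors[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers

def bow_histogram(descriptors, centers):
    # Quantize one clip's descriptors against the vocabulary and
    # return an L1-normalized histogram of word occurrences.
    dists = np.linalg.norm(descriptors[:, None] - centers[None], axis=2)
    words = dists.argmin(axis=1)
    hist = np.bincount(words, minlength=len(centers)).astype(float)
    return hist / max(hist.sum(), 1.0)
```

In the full pipeline these per-clip histograms would then be fed to a classifier (commonly an SVM); real systems use a dedicated k-means implementation rather than this didactic loop.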