2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
DOI: 10.1109/cvprw.2017.267
Okutama-Action: An Aerial View Video Dataset for Concurrent Human Action Detection

Abstract: Despite significant progress in the development of human action detection datasets and algorithms, no current dataset is representative of real-world aerial view scenarios. We present Okutama-Action, a new video dataset for aerial view concurrent human action detection. It consists of 43 minute-long fully-annotated sequences with 12 action classes. Okutama-Action features many challenges missing in current datasets, including dynamic transition of actions, significant changes in scale and aspect ratio, abrupt …

Cited by 165 publications (130 citation statements)
References 30 publications
“…pedestrians, bikers, cars and buses, to understand pedestrian trajectories and their interaction with the physical space as well as with the targets that populate such spaces. This could make a great contribution to pedestrian tracking, target trajectory prediction and activity understanding [238]. In [186], researchers adopt a camera-equipped UAV to record naturalistic vehicle trajectories and the naturalistic behavior of road users, intended for scenario-based safety validation of highly automated vehicles.…”
Section: Human and Social Understanding
confidence: 99%
“…State-of-the-art deep detectors like SSD cannot be trained on high-resolution images due to limited computational resources, so aerial action detection seems impossible with these types of networks. The authors of [4] show that SSD512 can detect pedestrians (not their actions) with a good mean average precision (mAP@0.5) of 72.3%, comparable to SSD's performance on frontal-view datasets like VOC2007 [7]. However, even with a larger 960x540 input, they report only mAP=18.18% for detecting pedestrians' actions.…”
Section: Proposed Methods
confidence: 99%
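The mAP@0.5 figures quoted above score a detection as correct only when its predicted box overlaps a ground-truth box with intersection-over-union (IoU) of at least 0.5. A minimal sketch of that overlap criterion (the `iou` helper and the example boxes are illustrative, not from the cited work):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Overlap rectangle, clamped to zero width/height when boxes are disjoint.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Under the mAP@0.5 protocol, a detection with IoU >= 0.5 against a
# ground-truth box of the same class counts as a true positive.
overlap = iou((0, 0, 10, 10), (5, 0, 15, 10))  # intersection 50, union 150
is_true_positive = overlap >= 0.5
```

Here the two boxes share half of each one's area, giving IoU = 50/150 ≈ 0.33, which would not count as a match at the 0.5 threshold.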
“…For instance, in an urgent situation, it might be very important to search, on the basis of people's tweets, for a person who is running and carrying something in a street. In this paper, we assume that objects of interest can be discriminated based on their single or multiple actions, and we evaluate the proposed framework on the Okutama-Action dataset [4], an aerial dataset for concurrent human single- and multiple-action detection.…”
Section: Introduction
confidence: 99%
“…We propose a framework in which an SSD [3], which has shown promising performance in the aerial image object detection literature [2], [17], first generates a number of object-of-interest proposals for an input aerial image. These proposals might contain vehicles, background, or other objects.…”
Section: Proposed Methods
confidence: 99%
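The two-stage idea in that excerpt (a detector emits proposals, and a second stage discards background) can be sketched as follows. Both stage functions here are stand-ins for the cited models, not the authors' actual implementations:

```python
# Illustrative sketch, not the cited framework: detect_proposals stands in
# for an SSD forward pass, classify for the second-stage classifier.

def detect_proposals(image):
    # Stand-in detector: returns (box, confidence score) pairs.
    return [((10, 10, 50, 50), 0.9), ((60, 5, 90, 40), 0.4)]

def classify(image, box):
    # Stand-in second stage: labels each proposal.
    return "vehicle" if box[0] < 30 else "background"

def objects_of_interest(image, score_thresh=0.5):
    """Keep proposals that are confident and not classified as background."""
    kept = []
    for box, score in detect_proposals(image):
        if score >= score_thresh and classify(image, box) != "background":
            kept.append(box)
    return kept
```

The design point is that the detector only needs to be class-agnostic at the proposal stage; the second classifier decides which proposals are actually objects of interest.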