Motion Representation with Acceleration Images

Kataoka, Hirokatsu; He, Yun; Shirakabe, Soma; Satoh, Yutaka

doi:10.1007/978-3-319-49409-8_3

Cited by 6 publications

(4 citation statements)

References 20 publications

(27 reference statements)

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Particularly, the three actions of "turning", " straight walking", and "crossing" are gaining fine-grained recognition due to their comparable form, changing pedestrian size, and inclusion in the same category as "walking. H. Kataoka et al [36] proposed a basic way for illustrating a change in a flow image using "acceleration images." These pictures should be significant since their representation differs from that of location (RGB) and velocity (flow) images.…”

Section: ) Ntsel Datasetmentioning

confidence: 99%

Pedestrian Attributes and Activity Recognition Using Deep Learning: A Comprehensive Survey

2023

IJSER

View full text Add to dashboard Cite

In recent years, pedestrian attributes and activity recognition has attracted expanding research emphasis due to their considerable research importance and value of application in the intelligent civil and military domains. Owing to the inadequate image or frame quality of inexpensive cameras and the absence of obvious and stable feature information, direction of pedestrian movement, and so on, the complication of pedestrian attributes and activity recognition is expanded. However, with the comprehensive implementation of deep learning techniques, pedestrian attributes and activity recognition has made a substantial advance. This paper is the first of its kind because it combines the recognition of pedestrian attributes and activities separately, and reviews the works on pedestrian attributes and activities using deep learning in relation to datasets. The fundamental concepts, corresponding challenges, and popular solutions are also explained. Furthermore, in this community, metrics of evaluation and concise performance comparisons are given. In the end, the hotspots of the present research and the directions of the future research are summarized.

show abstract

Section: ) Ntsel Datasetmentioning

confidence: 99%

Pedestrian Attributes and Activity Recognition Using Deep Learning: A Comprehensive Survey

2023

IJSER

View full text Add to dashboard Cite

show abstract

“…Simonyan and Zisserman (Simonyan and Zisserman 2014) propose a twostream framework which uses two ConvNets to respectively extract features from two information streams (i.e., appearance and motion) and fuse them for recognition. Based on this framework, recent researches further improve the effectiveness of ConvNet features by including additional information sources (Shi et al 2017;Kataoka et al 2016), selecting spatial-temporal attention parts (Kar et al 2017;Sharma, Kiros, and Salakhutdinov 2015;Zhu et al 2016), or incorporating more proper temporal information (Wang et al 2016b;Wu et al 2015;Cherian et al 2017;Bilen et al 2016).…”

Section: Related Workmentioning

confidence: 99%

“…(Best viewed in color) less progress in action recognition due to the high complexity of video data. Some recent studies attempted to improve the deep feature representation of an action by including additional information sources (Duta et al 2017;Shi et al 2017;Kataoka et al 2016), selecting spatialtemporal attention parts (Kar et al 2017;Sharma, Kiros, and Salakhutdinov 2015;Zhu et al 2016), or incorporating more proper temporal information (Wang et al 2016b;Cherian et al 2017). However, since most of them focus on learning features to directly describe actions' individual action classes, they have limitations in precisely differentiating the ambiguity among action classes due to the large intraclass variations and subtle inter-class differences of actions.…”

Section: Introductionmentioning

confidence: 99%

“…Second, combining multiple information streams (such as two-stream ConvNets (Simonyan and Zisserman 2014)) has shown strong performance and thus has become a mainstream framework in action recognition. However, most existing works only focus on introducing more information streams (Shi et al 2017;Kataoka et al 2016) or strengthening the correlation among streams (Wang et al 2016b;Wu et al 2015;Sun et al 2017), while the asynchronous issue among different information streams is less studied.…”

Section: Introductionmentioning

confidence: 99%

See 1 more Smart Citation

Action Recognition With Coarse-to-Fine Deep Feature Integration and Asynchronous Fusion

Lin

Zhang

Lü

et al. 2018

AAAI

View full text Add to dashboard Cite

Action recognition is an important yet challenging task in computer vision. In this paper, we propose a novel deep-based framework for action recognition, which improves the recognition accuracy by: 1) deriving more precise features for representing actions, and 2) reducing the asynchrony between different information streams. We first introduce a coarse-to-fine network which extracts shared deep features at different action class granularities and progressively integrates them to obtain a more accurate feature representation for input actions. We further introduce an asynchronous fusion network. It fuses information from different streams by asynchronously integrating stream-wise features at different time points, hence better leveraging the complementary information in different streams. Experimental results on action recognition benchmarks demonstrate that our approach achieves the state-of-the-art performance.

show abstract