2017
DOI: 10.1007/978-3-319-57021-1_19

Deep Learning for Action and Gesture Recognition in Image Sequences: A Survey

Abstract: Interest in automatic action and gesture recognition has grown considerably in the last few years. This is due in part to the large number of application domains for this type of technology. As in many other computer vision areas, deep learning based methods have quickly become a reference methodology for obtaining state-of-the-art performance in both tasks. This chapter is a survey of current deep learning based methodologies for action and gesture recognition in sequences of images. The survey reviews both f…

Cited by 42 publications (21 citation statements)
References 145 publications
“…Moreover, deep facial recognition [45], gesture recognition [46], crowd detection [16], crowd behavior analysis [47], crime scene analysis [48], etc. using machine learning techniques have been subjects of great interest beyond computer science.…”
Section: The Smart City and Crime (mentioning)
confidence: 99%
“…We also review some recent methods of feature augmentation. More comprehensive reviews on hand gesture recognition are found in [36,37,38,39].…”
Section: Related Work (mentioning)
confidence: 99%
“…In [17], deep architectures used for action recognition are categorized in four groups: 2D models, motion-based input features, 3D models and temporal networks. In the first group, [18] uses a pre-trained model on one or more frames which are sampled from the whole video.…”
Section: B Two-stream I3d (mentioning)
confidence: 99%
“…Therefore, we only focus on the cross-subject evaluation. In the cross-subject evaluation, samples of subjects 1,2,4,5,8,9,13,14,15,16,17,18,19,25,27,28,31,34,35 and 38 were used as training and samples of the remaining subjects were reserved for testing.…”
Section: A Datasets (mentioning)
confidence: 99%
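
The cross-subject protocol quoted in the last statement can be sketched in a few lines of Python. This is only an illustrative sketch: the training subject IDs are copied from the quote, while the function name `cross_subject_split` and the `(subject_id, sample)` input layout are assumptions made here, not part of the citing paper. The excerpt does not state the dataset or the total number of subjects, so every subject outside the training list is simply treated as a test subject.

```python
# Training subject IDs copied verbatim from the quoted citation statement.
TRAIN_SUBJECTS = frozenset(
    {1, 2, 4, 5, 8, 9, 13, 14, 15, 16, 17, 18, 19, 25, 27, 28, 31, 34, 35, 38}
)


def cross_subject_split(samples):
    """Split (subject_id, sample) pairs into train and test lists.

    Hypothetical helper: the pair layout is an assumption for illustration.
    Samples from subjects in TRAIN_SUBJECTS go to training; samples from
    all remaining subjects are reserved for testing.
    """
    train, test = [], []
    for subject_id, sample in samples:
        if subject_id in TRAIN_SUBJECTS:
            train.append(sample)
        else:
            test.append(sample)
    return train, test


if __name__ == "__main__":
    # Toy data: the strings stand in for video clips or skeleton sequences.
    toy = [(1, "clip_a"), (3, "clip_b"), (38, "clip_c"), (40, "clip_d")]
    train, test = cross_subject_split(toy)
    print(len(train), "training samples;", len(test), "test samples")
```

Splitting by subject rather than by sample keeps every recording of a given person on one side of the split, so the test set measures generalization to unseen people.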