2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017) 2017
DOI: 10.1109/fg.2017.150

A Survey on Deep Learning Based Approaches for Action and Gesture Recognition in Image Sequences

Abstract: The interest in action and gesture recognition has grown considerably in the last years. In this paper, we present a survey on current deep learning methodologies for action and gesture recognition in image sequences. We introduce a taxonomy that summarizes important aspects of deep learning for approaching both tasks. We review the details of the proposed architectures, fusion strategies, main datasets, and competitions. We summarize and discuss the main works proposed so far with particular interest on how t…

Cited by 165 publications (92 citation statements)
References 97 publications (128 reference statements)
“…Our method is inspired from [29]. Layer 2 extracts the local phase spectra of f(x) by computing the 3D Short Term Fourier Transform (STFT) in a local n×n×n neighborhood N_x at each position x of f(x) using Equation 1.…”
Section: Methods (mentioning)
confidence: 99%
“…CNNs have been used to classify actions and interactions in single frames [4,9,42]. Similar to the use of handcrafted features (Section 3), the focus is on a characteristic joint pose.…”
Section: Single Frame Network (mentioning)
confidence: 99%
“…During the back-propagation, due to its property of differentiability, it updates the gradient. The corresponding mask gradient of the input feature in the soft mask layer is as shown in equation (2). If the trunk features T are not correct, mask can prevent [54] T features to update the parameters as there is a multiplication factor of the mask M with partial derivative of T as shown in equation (2).…”
Section: D Residual Attention Network (mentioning)
confidence: 99%