2019
DOI: 10.3390/s19081932
Spatio–Temporal Image Representation of 3D Skeletal Movements for View-Invariant Action Recognition with Deep Convolutional Neural Networks

Abstract: Designing motion representations for 3D human action recognition from skeleton sequences is an important yet challenging task. An effective representation should be robust to noise, invariant to viewpoint changes, and achieve good performance with low computational demand. Two main challenges in this task are how to efficiently represent spatio–temporal patterns of skeletal movements and how to learn their discriminative features for classification tasks. This paper presents a novel skeleton-based repre…

Cited by 27 publications (10 citation statements). References 89 publications (214 reference statements).
“…Hug et al., 2019 [35] applied an action recognition model that transforms the skeleton into a spatial representation by converting pairwise joint-distance values into color points, and they used the DenseNet CNN model for action classification. We finish this section by presenting the major conclusions drawn by the two surveys of Wan et al., 2018 [36] and Pham et al., 2019 [37]. These authors found that methods based on human pose estimation and skeleton feature extraction can achieve higher classification rates.…”
Section: Related Work
confidence: 84%
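A minimal sketch of the joint-distance-to-color encoding described in the statement above, assuming pairwise Euclidean joint distances are normalized and mapped through a standard colormap; the function name, colormap, and normalization are illustrative assumptions, not the cited paper's exact procedure:

```python
import numpy as np
from matplotlib import cm

def distances_to_color_image(skeleton_seq):
    """Encode a skeleton sequence (T frames x J joints x 3 coords) as an
    RGB image of pairwise joint distances, one image row per frame.

    Illustrative sketch only: the cited method's exact distance set,
    normalization, and colormap are not specified here.
    """
    T, J, _ = skeleton_seq.shape
    # Pairwise Euclidean distances between all joints, per frame: (T, J, J).
    diffs = skeleton_seq[:, :, None, :] - skeleton_seq[:, None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # Keep each joint pair once (upper triangle) and flatten per frame.
    iu = np.triu_indices(J, k=1)
    feats = dists[:, iu[0], iu[1]]                       # (T, J*(J-1)/2)
    # Normalize distances to [0, 1] and map each value to a color point.
    feats = (feats - feats.min()) / (feats.max() - feats.min() + 1e-8)
    rgb = cm.viridis(feats)[..., :3]                     # (T, D, 3) floats
    return (rgb * 255).astype(np.uint8)

# Example: a 40-frame sequence with 20 joints -> a 40 x 190 RGB image.
img = distances_to_color_image(np.random.rand(40, 20, 3))
```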
“…The method eliminates the coordinate deviations caused by varying recording environments and posture displacements. Pham et al. (2019) exploit deep CNNs based on the DenseNet model to directly learn an end-to-end mapping between the input skeleton sequences and their action labels for human activity recognition. The network learns spatio–temporal patterns of skeletal movements and their discriminative features for downstream classification tasks.…”
Section: Related Work
confidence: 99%
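A hedged sketch of such an end-to-end pipeline, using torchvision's DenseNet-121 as a stand-in backbone; the image resolution, class count, and backbone choice are illustrative assumptions rather than the authors' exact configuration:

```python
import torch
import torch.nn as nn
from torchvision.models import densenet121

class SkeletonImageClassifier(nn.Module):
    """End-to-end mapping from a skeleton-derived image to an action label.

    Illustrative only: a 224x224 RGB encoding and a DenseNet-121 backbone
    (torchvision >= 0.13) stand in for the model described in the citation.
    """
    def __init__(self, num_classes):
        super().__init__()
        self.backbone = densenet121(weights=None)
        # Replace the ImageNet classifier head with one for action classes.
        self.backbone.classifier = nn.Linear(
            self.backbone.classifier.in_features, num_classes)

    def forward(self, x):          # x: (batch, 3, 224, 224) encoded sequences
        return self.backbone(x)

model = SkeletonImageClassifier(num_classes=60)   # e.g. 60 action classes
logits = model(torch.randn(2, 3, 224, 224))
print(logits.shape)                               # torch.Size([2, 60])
```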
“…Hinton et al. [54] define a DNN as a neural network that consists of two or more hidden layers between the input layer and the output layer. The literature contains a number of techniques that employ deep-learning-based approaches; for example, [1,55–57] rely on a Convolutional Neural Network (CNN), [19,58–60] utilize a Recurrent Neural Network (RNN) as well as Long Short-Term Memory (LSTM), and [61] uses Deep Progressive Reinforcement Learning (DPRL). Sedmidubsky et al. [1] propose a method for action recognition and segmentation in which the motions are mapped onto encoded RGB images.…”
Section: Deep-learning-based Approaches
confidence: 99%
“…They extract the most informative frames from the input action video sequences through DPRL and employ a graph-based CNN in order to exploit the extrinsic, as well as intrinsic, human joint dependencies. Pham et al. [57] propose the SPMF (Skeleton Posture-Motion Feature), built from the essential spatio–temporal information extracted from skeleton poses and their motions, in order to represent the unique patterns that exist in skeletal movements. The representation is further enhanced with the Adaptive Histogram Equalization (AHE) method to build the action map.…”
Section: Deep-learning-based Approaches
confidence: 99%
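A minimal sketch of the enhancement step mentioned above, assuming the SPMF-style action map is already available as an RGB array; scikit-image's equalize_adapthist (a contrast-limited variant of AHE) is used here as an illustrative substitute for the exact AHE implementation in the cited work:

```python
import numpy as np
from skimage import exposure

def enhance_action_map(action_map):
    """Apply adaptive histogram equalization to an SPMF-style action map.

    `action_map` is assumed to be an (H, W, 3) uint8 image built from
    skeleton posture and motion features. equalize_adapthist enhances
    local contrast so that subtle motion patterns become easier for a
    downstream CNN to discriminate.
    """
    img = action_map.astype(np.float64) / 255.0      # scale to [0, 1]
    enhanced = exposure.equalize_adapthist(img, clip_limit=0.03)
    return (enhanced * 255).astype(np.uint8)

# Example with a random placeholder map in place of a real action map.
enhanced = enhance_action_map(
    np.random.randint(0, 256, size=(64, 128, 3), dtype=np.uint8))
```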