2019
DOI: 10.1007/978-3-030-20257-6_44

Recognizing Human Actions Using 3D Skeletal Information and CNNs

Cited by 14 publications
(6 citation statements)
References 17 publications
“…It should be intuitive that (a) different activities require different amounts of time and (b) the same activity requires different amounts of time both in the cases it is performed by different subjects and even when it is performed by the same subject. As in [20], we used a linear interpolation step, setting the number of frames F as equal for all activity examples. Upon performing several experiments, we set .…”
Section: Proposed Methodology (mentioning)
confidence: 99%
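
The linear interpolation step described in this statement can be illustrated with a minimal sketch. It is not the cited authors' code: the function name `resample_linear`, the flattened per-frame coordinate layout, and the target length used in the example are all assumptions made for illustration (the statement itself does not preserve the chosen value of F).

```python
from typing import List, Sequence

Frame = List[float]  # assumed layout: flattened joint coordinates of one frame


def resample_linear(frames: Sequence[Sequence[float]], target_frames: int) -> List[Frame]:
    """Resample a sequence to exactly `target_frames` frames.

    Each output frame is a linear blend of the two nearest input frames,
    so short and long recordings end up with the same length F.
    """
    n = len(frames)
    if n == 0:
        raise ValueError("cannot resample an empty sequence")
    if target_frames < 1:
        raise ValueError("target_frames must be at least 1")
    if n == 1 or target_frames == 1:
        # Nothing to interpolate between: repeat the first frame.
        return [list(frames[0]) for _ in range(target_frames)]

    resampled: List[Frame] = []
    for k in range(target_frames):
        # Position of output frame k on the original time axis [0, n - 1].
        t = k * (n - 1) / (target_frames - 1)
        lo = min(int(t), n - 2)  # index of the left neighbour
        w = t - lo               # blend weight of the right neighbour
        left, right = frames[lo], frames[lo + 1]
        resampled.append([(1 - w) * a + w * b for a, b in zip(left, right)])
    return resampled


# Hypothetical example: a 3-frame, 1-coordinate sequence stretched to F = 5.
stretched = resample_linear([[0.0], [2.0], [4.0]], 5)
```

The same call also shortens sequences: any input length is mapped onto the same F, which is what allows activity examples of different durations to be fed to a fixed-size classifier input.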
“…For classification, we used a Convolutional Neural Network (CNN). Specifically, the architecture of the CNN that has been used throughout our experiments has been experimentally defined and was initially used in previous works [20, 34]. It consists of a 2D convolutional layer that filters the input image with five kernels of size, a max-pooling layer that performs subsampling, two consecutive convolutional layers of size with 10 and 15 kernels, a max-pooling layer performing subsampling, a flattened layer that transforms the output of the last pooling layer into a vector, which consists of the input to a dense layer upon applying a dropout layer with a dropout rate equal to and a second dense layer producing the output of the network.…”
Section: Proposed Methodology (mentioning)
confidence: 99%
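
To make the layer order in this statement easier to follow, the sketch below traces the tensor shape after each layer using plain Python, with no deep learning framework. The kernel size, pooling size, dropout rate, and dense-layer width are missing from the statement as extracted, so the values used here (3x3 kernels, 2x2 pooling, stride 1, no padding, a 32x32x3 input, 128 hidden units, 10 classes) are purely hypothetical placeholders and should not be read as the authors' settings.

```python
from typing import List, Tuple

Shape = Tuple[int, ...]


def conv2d(shape: Shape, n_kernels: int, k: int) -> Shape:
    """Output shape of a stride-1, unpadded 2D convolution with k x k kernels."""
    h, w, _ = shape
    if h < k or w < k:
        raise ValueError("input is smaller than the kernel")
    return (h - k + 1, w - k + 1, n_kernels)


def max_pool(shape: Shape, p: int) -> Shape:
    """Output shape of non-overlapping p x p max pooling."""
    h, w, c = shape
    return (h // p, w // p, c)


def trace_cnn(input_shape: Shape, hidden_units: int, n_classes: int,
              k: int = 3, p: int = 2) -> List[Tuple[str, Shape]]:
    """List (layer name, output shape) pairs for the described layer stack.

    k, p, hidden_units and n_classes are assumed values, not taken from the paper.
    """
    trace: List[Tuple[str, Shape]] = []
    s = conv2d(input_shape, 5, k)          # conv layer with 5 kernels
    trace.append(("conv2d_5", s))
    s = max_pool(s, p)                     # first subsampling step
    trace.append(("max_pool_1", s))
    s = conv2d(s, 10, k)                   # conv layer with 10 kernels
    trace.append(("conv2d_10", s))
    s = conv2d(s, 15, k)                   # conv layer with 15 kernels
    trace.append(("conv2d_15", s))
    s = max_pool(s, p)                     # second subsampling step
    trace.append(("max_pool_2", s))
    flat: Shape = (s[0] * s[1] * s[2],)    # flatten to a vector
    trace.append(("flatten", flat))
    trace.append(("dropout", flat))        # dropout keeps the shape unchanged
    trace.append(("dense_hidden", (hidden_units,)))
    trace.append(("dense_output", (n_classes,)))
    return trace


# Hypothetical input size and layer widths, chosen only for illustration.
trace = trace_cnn((32, 32, 3), hidden_units=128, n_classes=10)
for name, shape in trace:
    print(f"{name:<13} {shape}")
```

Tracing shapes this way shows how the two pooling steps shrink the image before flattening, which determines the number of inputs to the first dense layer.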
“…Annotated data are usually pre-processed with several cleaning methodologies prior to being used as input for an ML algorithm. This step may include, e.g., treating actions as signals and then using signal processing techniques to transform them into images [21, 22], utilizing low-resolution RGB frames or cropping the central area of the frames [23] or even considering short- and long-term dependencies based on depth [24]. Then, ML/DL algorithms are applied to those data for action recognition.…”
Section: Related Work (mentioning)
confidence: 99%
“…There have always been many convoluted image-based visual tasks to settle in computer vision [49][50][51], such as image retrieval, image classification, semantic segmentation, image captioning, etc. However, the emergence of innovative computational models and learning algorithms has provided new approaches for solving those highly demanding tasks.…”
Section: The Proposed Research Framework (mentioning)
confidence: 99%