ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021
DOI: 10.1109/icassp39728.2021.9413860

Progressive Spatio-Temporal Graph Convolutional Network for Skeleton-Based Human Action Recognition

Abstract: Graph convolutional networks have been very successful in skeleton-based human action recognition where the sequence of skeletons is modeled as a graph. However, most of the graph convolutional network-based methods in this area train a deep feed-forward network with a fixed topology that leads to high computational complexity and restricts their application in low computation scenarios. In this paper, we propose a method to automatically find a compact and problem-specific topology for spatio-temporal graph co…

Citations: cited by 7 publications (4 citation statements)
References: 16 publications
“…GCN-based models for skeleton-based action recognition [15,16,18,22,23,27,28] operate on sequences of skeleton graphs. The spatio-temporal graph of skeletons G = (V, E) has the human body joint coordinates as nodes V and the spatial and temporal connections between them as edges E. Figure 2 (right) illustrates such a spatio-temporal graph where the spatial graph edges encode the human bones and the temporal edges connect the same joints in subsequent time-steps.…”
Section: A Spatio-temporal Graph Convolutional Network
Citation type: mentioning
confidence: 99%
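
The graph construction described in the statement above can be sketched in a few lines. This is a minimal illustration only, not code from the cited papers: the 5-joint toy skeleton, the bone list, and the function name build_spatio_temporal_graph are all assumptions made for the example. Nodes are (frame, joint) pairs, spatial edges follow the bones within a frame, and temporal edges link the same joint in consecutive frames.

```python
from itertools import product

# Hypothetical 5-joint toy skeleton (an assumption, not taken from the cited papers):
# 0 = torso, 1 = head, 2 = left hand, 3 = right hand, 4 = feet
BONES = [(0, 1), (0, 2), (0, 3), (0, 4)]
NUM_JOINTS = 5


def build_spatio_temporal_graph(num_frames, num_joints=NUM_JOINTS, bones=BONES):
    """Return (nodes, edges) of an undirected spatio-temporal skeleton graph.

    A node is a (frame, joint) pair. Spatial edges link the two joints of a
    bone within one frame; temporal edges link the same joint in consecutive
    frames.
    """
    nodes = list(product(range(num_frames), range(num_joints)))
    edges = set()
    # Spatial edges: one per bone, repeated in every frame.
    for t in range(num_frames):
        for i, j in bones:
            edges.add(((t, i), (t, j)))
    # Temporal edges: same joint, frame t to frame t + 1.
    for t in range(num_frames - 1):
        for j in range(num_joints):
            edges.add(((t, j), (t + 1, j)))
    return nodes, edges


nodes, edges = build_spatio_temporal_graph(num_frames=3)
print(len(nodes), len(edges))  # 15 nodes; 12 spatial + 10 temporal = 22 edges
```

With T frames, J joints, and B bones, this yields T*J nodes, T*B spatial edges, and (T-1)*J temporal edges, which shows how quickly the graph grows with sequence length.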
“…Unfortunately, the high computational complexity of these GCN-based methods makes them infeasible in real-time applications and resource-constrained online inference settings. Multiple approaches have been explored to increase the efficiency of skeleton-based action recognition recently: GCN-NAS [22] and PST-GCN [23] are neural architecture search based methods which try to find an optimized ST-GCN architecture to increase the efficiency of the classification task; ShiftGCN [24] replaces graph and temporal convolutions with a zero-FLOPs shift graph operation and pointwise convolutions as an efficient alternative to the feature-propagation rule for GCNs [25]; ShiftGCN++ [26] boosts the efficiency of ShiftGCN further via progressive architecture search, knowledge-distillation, explicit spatial positional encodings, and a Dynamic Shift Graph Convolution; SGN [27] utilizes semantic information such as joint type and frame index as side information to design a compact semantics-guided neural network (SGN) for capturing both spatial and temporal correlations in joint and frame level; TA-GCN [28] tries to make inference more efficient by selecting a subset of key skeletons, which hold the most important features for action recognition, from a sequence to be processed by the spatio-temporal convolutions.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Methods utilizing GCNs obviously need the movement represented as a graph. Popular encodings are spatiotemporal graphs [150, 151, 152]. Usually, the graph structure is a description of the skeleton structure, where each node represents a joint, and the edges indicate that two joints are connected by a limb.…”
Section: Machine Learning Algorithms For Human Motion Analysis
Citation type: mentioning
confidence: 99%
“…There have been different approaches to reduce computational complexity when training deep neural networks, such as designing novel low-complexity network architectures (Kiranyaz et al., 2017; Tran et al., 2019c; Tran & Iosifidis, 2019; Tran et al., 2020; Kiranyaz et al., 2020; Heidari & Iosifidis, 2020), replacing existing ones with their low-rank counterparts (Denton et al., 2014; Jaderberg et al., 2014; Tran et al., 2018; Huang & Yu, 2018; Ruan et al., 2020), or adapting the pre-trained models to new tasks, i.e., performing Transfer Learning (TL) (Shao et al., 2014; Yang et al., 2015; Ding et al., 2016; Ding & Fu, 2018; Fons et al., 2020) or Domain Adaptation (DA) learning (Duan et al., 2012; Wang et al., 2019; Zhao et al., 2020; Hedegaard et al., 2021). Among these approaches, model adaptation is the most versatile since a method in this category is often architecture-agnostic, being complementary to other approaches.…”
Section: Introduction
Citation type: mentioning
confidence: 99%