2022
DOI: 10.48550/arxiv.2203.11009
Preprint

Online Skeleton-based Action Recognition with Continual Spatio-Temporal Graph Convolutional Networks

Abstract: Graph-based reasoning over skeleton data has emerged as a promising approach for human action recognition. However, the application of prior graph-based methods, which predominantly employ whole temporal sequences as their input, to the setting of online inference entails considerable computational redundancy. In this paper, we tackle this issue by reformulating the Spatio-Temporal Graph Convolutional Neural Network as a Continual Inference Network, which can perform step-by-step predictions in time without re…

Citations: cited by 2 publications (5 citation statements)
References: 26 publications
“…Through a reformulation of the 3D convolution to compute inputs step-by-step rather than spatio-temporally, well-performing 3D CNNs such as X3D (Feichtenhofer, 2020), Slow (Feichtenhofer et al., 2019), and I3D (Carreira & Zisserman, 2017) trained for Trimmed Activity Recognition were re-implemented to execute step-by-step without any re-training. Likewise, Spatio-temporal Graph Convolutional Networks for Skeleton-based Action Recognition (Yan et al., 2018; Shi et al., 2019; Plizzari et al., 2021), which originally operated only on batches, were recently transformed to perform step-wise inference as well through a continual formulation of their Spatio-temporal Graph Convolution blocks (Hedegaard et al., 2022b).…”
Section: Continual Inference Network (mentioning)
confidence: 99%
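The step-by-step reformulation described in this statement can be illustrated with a minimal sketch. It is not the cited authors' implementation: it assumes scalar frames (real skeleton frames are joint-by-channel tensors), a single temporal convolution without padding, and hypothetical names (`full_temporal_conv`, `ContinualTemporalConv`). It only shows that caching the most recent frames lets a layer emit one output per incoming frame while matching the output of the whole-sequence computation.

```python
from collections import deque


def full_temporal_conv(frames, kernel):
    """Regular temporal convolution (cross-correlation, no padding)
    applied to a whole sequence at once."""
    k = len(kernel)
    return [
        sum(w * x for w, x in zip(kernel, frames[t - k + 1 : t + 1]))
        for t in range(k - 1, len(frames))
    ]


class ContinualTemporalConv:
    """Step-wise counterpart (illustrative): consumes one frame per call
    and keeps only the last len(kernel) frames in a buffer."""

    def __init__(self, kernel):
        self.kernel = list(kernel)
        # A bounded deque drops the oldest frame automatically.
        self.buffer = deque(maxlen=len(self.kernel))

    def step(self, frame):
        self.buffer.append(frame)
        if len(self.buffer) < len(self.kernel):
            return None  # not enough frames seen yet
        return sum(w * x for w, x in zip(self.kernel, self.buffer))


# Example values chosen for illustration only.
frames = [1.0, 2.0, 4.0, 8.0, 16.0]
kernel = [0.5, 0.25, 0.25]

conv = ContinualTemporalConv(kernel)
stepwise = [y for y in (conv.step(f) for f in frames) if y is not None]
batch = full_temporal_conv(frames, kernel)
```

Because both paths apply the same weights to the same frames, `stepwise` and `batch` agree, which is consistent with the statement that existing weights can be reused without re-training.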
“…In general, non-continual networks that are transformed to continual ones attain reductions in per-step computational complexity in proportion to the temporal receptive field of the network. In some cases, these savings can amount to multiple orders of magnitude (Hedegaard et al., 2022b). Still, the implementation of Continual Inference Networks with temporal convolutions and Multi-head Attention in frameworks such as PyTorch (Paszke et al., 2019) requires deep knowledge and practical experience with CINs.…”
Section: Continual Inference Network (mentioning)
confidence: 99%
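The proportionality claimed in this statement can be made concrete with a rough operation count. The sketch below is an assumption-laden simplification, not a figure from the cited work: it counts multiply-accumulates for a stack of padded temporal convolutions per scalar channel, ignores the spatial graph convolution, and uses hypothetical function names and illustrative layer counts.

```python
def sliding_window_macs(num_layers, kernel_size, window):
    """Multiply-accumulates per prediction when the whole window is
    re-processed. Assumes padded temporal convolutions, so every layer
    emits `window` outputs."""
    return num_layers * kernel_size * window


def continual_macs(num_layers, kernel_size):
    """Multiply-accumulates per prediction when each layer computes
    only the output for the newest frame."""
    return num_layers * kernel_size


def saving_factor(num_layers, kernel_size, window):
    """Ratio of per-step cost: sliding-window over continual."""
    return sliding_window_macs(num_layers, kernel_size, window) / continual_macs(
        num_layers, kernel_size
    )


# Illustrative configuration: 10 layers, temporal kernel 9, 300-frame window.
factor = saving_factor(num_layers=10, kernel_size=9, window=300)
```

Under these assumptions the ratio equals the window length, so a 300-frame window gives a 300-fold reduction in per-step work, in line with the "proportional to the temporal receptive field" wording above.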
“…Heidari and Iosifidis [20] proposed the TA-GCN model, which selects the key bones most conducive to activity recognition and performs spatiotemporal convolution operations on the skeleton sequence. To realize online human behavior recognition, Hedegaard, Heidari et al. [21] proposed the Continual Spatio-Temporal Graph Convolutional Network (CoST-GCN), which reformulates the spatio-temporal graph convolutional neural network as a continual inference network that processes input without frame repetition. In this way, step-by-step prediction along the time axis is implemented.…”
Section: Introduction (mentioning)
confidence: 99%