Pose Encoding for Robust Skeleton-Based Action Recognition

Demisse, Girum G.; Papadopoulos, Konstantinos; Aouada, Djamila; Ottersten, Björn

doi:10.1109/cvprw.2018.00056

Cited by 35 publications

(29 citation statements)

References 19 publications


“…As future work, we intend to explore in more detail the properties of the deformation-based alignment in motion analysis. For that purpose, a penalty term would be added in the minimization problem, such that it would incorporate problem-related constraints, e.g., 3D skeleton-specific geometry or noise-free skeleton dynamics [30]. We also intend to expand the proposed approach to represent a skeleton sequence as a point in the deformation space, without any prior knowledge.…”

Section: Discussion (mentioning)

confidence: 99%

Deformation-Based Abnormal Motion Detection using 3D Skeletons

Baptista

Demisse

Aouada

et al. 2018

2018 Eighth International Conference on Image Processing Theory, Tools and Applications (IPTA)

Self Cite


In this paper, we propose a system for abnormal motion detection using 3D skeleton information, where the abnormal motion is not known a priori. To that end, we present a curve-based representation of a sequence, based on few joints of a 3D skeleton, and a deformation-based distance function. We further introduce a time-variation model that is specifically designed for assessing the quality of a motion; we refer to a distance function that is based on such a model as motion quality distance. The overall advantages of the proposed approach are 1) lower dimensional yet representative sequence representation and 2) a distance function that emphasizes time variation, the motion quality distance, which is a particularly important property for quality assessment. We validate our approach using a publicly available dataset, SPHERE-StairCase2014 dataset. Qualitative and quantitative results show promising performance.


“…(ii) The biases of the gating units, g_l, in G_l, at layer l are initialized to negative values such as -1 or -3 [10] to ensure that most of the features learned at layer l, H(x)_l, are untransformed features, H(x)_{l−1}. This is due to the gating units' activations, G_l(H(x)_{l−1}), being close to 0; see (1).…”

Section: A Background: Highway Network (mentioning)

confidence: 95%
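
The effect described in the statement above can be illustrated with a small numeric sketch. The citing paper's equation (1) is not reproduced on this page, so the code assumes the commonly used highway formulation, y = g · h + (1 − g) · x with a sigmoid gate g. The function names, the scalar setting, and the zero gate weight are all hypothetical choices made only to isolate the role of the bias.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function used for the gate activation."""
    return 1.0 / (1.0 + math.exp(-z))


def highway_unit(x: float, h: float, gate_weight: float, gate_bias: float):
    """One scalar highway unit (assumed form, not taken from the cited paper).

    x is the untransformed input feature, h is the transformed feature,
    and the gate g decides how much of h replaces x.
    Returns the output y and the gate activation g.
    """
    g = sigmoid(gate_weight * x + gate_bias)
    y = g * h + (1.0 - g) * x
    return y, g


if __name__ == "__main__":
    x, h = 1.0, 5.0  # illustrative values only
    for bias in (0.0, -1.0, -3.0):
        # gate_weight is set to 0 so that only the bias drives the gate
        y, g = highway_unit(x, h, gate_weight=0.0, gate_bias=bias)
        print(f"bias={bias:+.1f}  gate={g:.3f}  output={y:.3f}")
```

With a more negative bias the gate activation moves toward 0, so the output stays close to the untransformed input x, which is the behaviour the quoted passage attributes to the initialization.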

“…Deep neural networks have found applications in many real-life tasks; their successes for learning different difficult problems are well documented. In particular, the field of computer vision has hugely benefited from deep neural networks for various applications ranging from pose estimation [1], segmentation [2], action recognition [3], face recognition [4], etc. In recent times, there is a growing trend of using deep networks for learning directly from raw data (i.e.…”

Section: Introduction (mentioning)

confidence: 99%

“…Moreover, the difficulty of the problems that we intend to solve over time has consistently increased. For example, the MNIST dataset that was considered very difficult many years ago, and thus used for benchmarking, is no longer considered as a hard dataset; many works [5], [6] have reported error rates in the range [0.5%, 0.21%]. However, tackling harder learning tasks has motivated a new direction for deep neural networks with many layers of feature representations.…”

Section: Introduction (mentioning)

confidence: 99%


Improved Highway Network Block for Training Very Deep Neural Networks

et al. 2020

Self Cite


Very deep networks are successful in various tasks with reported results surpassing human performance. However, training such very deep networks is not trivial. Typically, the problems of learning the identity function and feature reuse can work together to plague optimization of very deep networks. In this paper, we propose a highway network with gate constraints that addresses the aforementioned problems, and thus alleviates the difficulty of training. Namely, we propose two variants of highway network, HWGC and HWCC, employing feature summation and concatenation respectively. The proposed highway networks, besides being more computationally efficient, are shown to have more interesting learning characteristics such as natural learning of hierarchical and robust representations due to a more effective usage of model depth, fewer gates for successful learning, better generalization capacity and faster convergence than the original highway network. Experimental results show that our models outperform the original highway network and many state-of-the-art models. Importantly, we observe that our second model with feature concatenation and compression consistently outperforms our model with feature summation of similar depth, the original highway network, many state-of-the-art models and even ResNets on four benchmarking datasets which are CIFAR-10, CIFAR-100, Fashion-MNIST, SVHN and imagenet-2012 (ILSVRC) datasets. Furthermore, the second proposed model is more computationally efficient than the state-of-the-art in view of training, inference time and GPU memory resource, which strongly supports real-time applications. Using a similar number of model parameters for the CIFAR-10, CIFAR-100, Fashion-MNIST and SVHN datasets, the significantly shallower proposed model can surpass the performance of ResNet-110 and ResNet-164 that are roughly 6 and 8 times deeper, respectively. 
Similarly, for the imagenet dataset, the proposed models surpass the performance of ResNet-101 and ResNet-152 that are roughly three times deeper.


“…TABLE IV: A COMPARISON BETWEEN THE PROPOSED METHOD AND STATE-OF-THE-ART APPROACHES IN TERMS OF NORTH WESTERN UCLA DATASET.

Paper | Cross-subject | Cross-view
Virtual view [34] | 50.70 | 47.80
Hankelet [35] | 54.20 | 45.20
MST-AOG [26] | 81.60 | 73.30
Action Bank [36] | 24.60 | 17.60
Poselet [37] | 54.90 | 24.50
Denoised-LSTM [38] | - | 79.57
tLDS [39] | 92

It can be seen in Table IV, Virtual view [34] and Hankelet [35] methods are limited in their performance which reflects the challenges of the North Western UCLA dataset (e.g. noise, cluttered backgrounds and various view points).…”

Section: A North Western UCLA Dataset (mentioning)

confidence: 99%

Multi-view region-adaptive multi-temporal DMM and RGB action recognition

Al-Faris

Chiverton

Yang

et al. 2020

Pattern Anal Applic


Human action recognition remains an important yet challenging task. This work proposes a novel action recognition system. It uses a novel Multiple View Region Adaptive Multi-resolution in time Depth Motion Map (MV-RAMDMM) formulation combined with appearance information. Multiple stream 3D Convolutional Neural Networks (CNNs) are trained on the different views and time resolutions of the region adaptive Depth Motion Maps. Multiple views are synthesised to enhance the view invariance. The region adaptive weights, based on localised motion, accentuate and differentiate parts of actions possessing faster motion. Dedicated 3D CNN streams for multi-time resolution appearance information (RGB) are also included. These help to identify and differentiate between small object interactions. A pre-trained 3D-CNN is used here with fine-tuning for each stream along with multiple class Support Vector Machines (SVM)s. Average score fusion is used on the output. The developed approach is capable of recognising both human action and human-object interaction. Three public domain datasets including: MSR 3D Action, Northwestern UCLA multi-view actions and MSR 3D daily activity are used to evaluate the proposed solution. The experimental results demonstrate the robustness of this approach compared with state-of-the-art algorithms.
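
The abstract above mentions average score fusion over the outputs of several streams. A minimal sketch of that general idea follows, assuming each stream yields one score per class; the function names, the number of streams, and the score values are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def average_score_fusion(stream_scores: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class scores across streams (each row is one stream)."""
    if not stream_scores:
        raise ValueError("at least one stream is required")
    n_classes = len(stream_scores[0])
    if any(len(scores) != n_classes for scores in stream_scores):
        raise ValueError("all streams must score the same number of classes")
    n_streams = len(stream_scores)
    return [
        sum(scores[c] for scores in stream_scores) / n_streams
        for c in range(n_classes)
    ]


def predict(stream_scores: Sequence[Sequence[float]]) -> int:
    """Return the index of the class with the highest fused score."""
    fused = average_score_fusion(stream_scores)
    return max(range(len(fused)), key=fused.__getitem__)


if __name__ == "__main__":
    # Three hypothetical streams scoring three classes
    scores = [
        [0.6, 0.3, 0.1],
        [0.2, 0.5, 0.3],
        [0.1, 0.6, 0.3],
    ]
    print(average_score_fusion(scores))
    print(predict(scores))
```

In this sketch the first stream alone would pick class 0, while the fused scores favour class 1, which shows how averaging can override a single stream's decision.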

