Alex C. Kot scite author profile

Research on depth-based human activity analysis achieved outstanding performance and demonstrated the effectiveness of 3D representation for action recognition. The existing depth-based and RGB+D-based action recognition benchmarks have a number of limitations, including the lack of large-scale training samples, realistic number of distinct class categories, diversity in camera views, varied environmental conditions, and variety of human subjects. In this work, we introduce a large-scale dataset for RGB+D human action recognition, which is collected from 106 distinct subjects and contains more than 114 thousand video samples and 8 million frames. This dataset contains 120 different action classes including daily, mutual, and health-related activities. We evaluate the performance of a series of existing 3D activity analysis methods on this dataset, and show the advantage of applying deep learning methods for 3D-based human action recognition. Furthermore, we investigate a novel one-shot 3D activity recognition problem on our dataset, and a simple yet effective Action-Part Semantic Relevance-aware (APSR) framework is proposed for this task, which yields promising results for recognition of the novel action classes. We believe the introduction of this large-scale dataset will enable the community to apply, adapt, and develop various data-hungry learning techniques for depth-based and RGB+D-based human activity understanding. [The dataset is available at: http://rose1.ntu.edu.sg/Datasets/actionRecognition.asp]

Skeleton-Based Action Recognition Using Spatio-Temporal LSTM Network with Trust Gates

Liu

Shahroudy

et al. 2018

IEEE Trans. Pattern Anal. Mach. Intell.

426

244

Skeleton-based human action recognition has attracted a lot of research attention during the past few years. Recent works attempted to utilize recurrent neural networks to model the temporal dependencies between the 3D positional configurations of human body joints for better analysis of human activities in the skeletal data. The proposed work extends this idea to the spatial domain as well as the temporal domain to better analyze the hidden sources of action-related information within the human skeleton sequences in both of these domains simultaneously. Based on the pictorial structure of Kinect's skeletal data, an effective tree-structure based traversal framework is also proposed. In order to deal with the noise in the skeletal data, a new gating mechanism within the LSTM module is introduced, with which the network can learn the reliability of the sequential data and accordingly adjust the effect of the input data on the updating procedure of the long-term context representation stored in the unit's memory cell. Moreover, we introduce a novel multi-modal feature fusion strategy within the LSTM unit in this paper. The comprehensive experimental results on seven challenging benchmark datasets for human action recognition demonstrate the effectiveness of the proposed method.

Unsupervised Domain Adaptation for Face Anti-Spoofing

Cao

et al. 2018

IEEE Trans. Inform. Forensic Secur.

214

140

Deep Coupled ResNet for Low-Resolution Face Recognition

Jiang

Kot

2018

IEEE Signal Process. Lett.

222

110

Learning Generalized Deep Feature Representation for Face Anti-Spoofing

Wang

et al. 2018

IEEE Trans. Inform. Forensic Secur.

171

NTU RGB+D 120: A Large-Scale Benchmark for 3D Human Activity Understanding
