2016
DOI: 10.1109/tpami.2016.2537340
|View full text |Cite
|
Sign up to set email alerts
|

Deep Dynamic Neural Networks for Multimodal Gesture Segmentation and Recognition

Abstract: This paper describes a novel method called Deep Dynamic Neural Networks (DDNN) for multimodal gesture recognition. A semi-supervised hierarchical dynamic framework based on a Hidden Markov Model (HMM) is proposed for simultaneous gesture segmentation and recognition where skeleton joint information, depth and RGB images, are the multimodal input observations. Unlike most traditional approaches that rely on the construction of complex handcrafted features, our approach learns high-level spatio-temporal represen… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
1
1
1
1

Citation Types

0
198
0
4

Year Published

2017
2017
2023
2023

Publication Types

Select...
7
2

Relationship

2
7

Authors

Journals

citations
Cited by 401 publications
(209 citation statements)
references
References 52 publications
(82 reference statements)
0
198
0
4
Order By: Relevance
“…Others, such as [10], [11], [25], [26], [27], accept a more relaxed hash coding restriction that heterogeneous data representing common objects share similar binary codes which means the Hamming distance of their binary codes, should be small enough. Some other interesting methods could be found [23], [28], [29], [30], [31], [32].…”
Section: Related Workmentioning
confidence: 99%
“…Others, such as [10], [11], [25], [26], [27], accept a more relaxed hash coding restriction that heterogeneous data representing common objects share similar binary codes which means the Hamming distance of their binary codes, should be small enough. Some other interesting methods could be found [23], [28], [29], [30], [31], [32].…”
Section: Related Workmentioning
confidence: 99%
“…The authors found this feature learning algorithm is surprisingly successful an applied to detect image objects. The authors in the paper [2] describe the hundreds of thousands of unlabelled videos from the web to learn visual representation of those videos it helps tracking visually provides the super vision that means two patches connected by a track should have similar visual representation in deep feature space since they probably using deep dynamic neural networks for multimodal gesture segmentation and recognition [3]. The author proposed semi supervised hierarchical dynamic framework based on Hidden Markov model (HMM) for simultaneous gesture segmentation and recognition where skeleton joint information, depth and RGB images, are the multimodal input observation.…”
Section: Literature Surveymentioning
confidence: 99%
“…This model will be used in the later stage for detection of objects and arriving features inside images uploaded by user.This paper utilizes python implementation [3] for CNN.…”
Section: B Creating Training Datamentioning
confidence: 99%
“…Yan and Shao [38] also used deep learning technique to estimate image blur blindly. For more reference about the application of deep learning, see [39], [40] Taking advantages of deep learning, these image based algorithms offer more promising superresolution estimations than most of patch based algorithms. However, the huge burden of training a convolutional neural network makes these image based algorithms time-consuming during the training process.…”
Section: A Brief Review On Single-image Super-resolutionmentioning
confidence: 99%