2018
DOI: 10.1007/s11263-018-1121-3

Deep Sign: Enabling Robust Statistical Continuous Sign Language Recognition via Hybrid CNN-HMMs

Abstract: This manuscript introduces the end-to-end embedding of a CNN into an HMM, while interpreting the outputs of the CNN in a Bayesian framework. The hybrid CNN-HMM combines the strong discriminative abilities of CNNs with the sequence modelling capabilities of HMMs. Most current approaches in the field of gesture and sign language recognition disregard the necessity of dealing with sequence data both for training and evaluation. With our presented end-to-end embedding we are able to improve over the state-of-the-ar…
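
The abstract does not spell out how the CNN outputs are interpreted in a Bayesian framework. As a rough illustration only, the sketch below assumes the convention common to hybrid neural-network/HMM systems: per-frame state posteriors p(s|x) from the classifier are divided by the state priors p(s) to obtain scaled likelihoods, which then serve as HMM emission scores during Viterbi decoding. The function names, the `prior_scale` parameter, and the toy numbers are hypothetical and are not taken from the paper.

```python
import numpy as np

EPS = 1e-12  # avoids log(0); an implementation detail of this sketch


def scaled_log_likelihoods(posteriors, priors, prior_scale=1.0):
    """Bayes' rule up to a constant: p(x|s) is proportional to p(s|x) / p(s)**prior_scale.

    posteriors: (T, S) array of per-frame state posteriors (e.g. CNN softmax outputs).
    priors:     (S,) array of state priors (e.g. relative state frequencies).
    prior_scale is a hypothetical tuning knob, assumed here for illustration.
    """
    return np.log(posteriors + EPS) - prior_scale * np.log(priors + EPS)


def viterbi(log_emissions, log_trans, log_init):
    """Return the most likely HMM state path; all inputs are in log space."""
    T, S = log_emissions.shape
    score = log_init + log_emissions[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        # cand[i, j] = best score ending in state i at t-1, then moving to j
        cand = score[:, None] + log_trans
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_emissions[t]
    # Backtrack from the best final state
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]


# Toy example: 4 frames, 2 states, left-to-right topology (all values made up)
posteriors = np.array([[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.1, 0.9]])
priors = np.array([0.5, 0.5])
log_trans = np.log(np.array([[0.5, 0.5], [0.0, 1.0]]) + EPS)
log_init = np.log(np.array([1.0, 0.0]) + EPS)

log_emissions = scaled_log_likelihoods(posteriors, priors)
state_path = viterbi(log_emissions, log_trans, log_init)
print(state_path)
```

Dividing by the prior matters most when state frequencies are imbalanced: a classifier trained on frame labels tends to favour frequent states, and the division removes that bias before the HMM applies its own transition model.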

Citations: cited by 157 publications (87 citation statements)
References: 43 publications (55 reference statements)

“…Some use colored gloves to ease hand and finger tracking [26]. Recent advances in machine learning, i.e., deep learning and convolutional neural networks (CNNs), have improved state-of-the-art computer vision approaches [76], though lack of sufficient training data currently limits the use of modern Artificial Intelligence (AI) techniques in this problem space.…”
Section: Recognition and Computer Vision (mentioning)
confidence: 99%
“…Huang et al. [15] learn a hand detector based on Faster R-CNN [33] using manually annotated signing hand bounding boxes, and apply it to general sign language recognition. Some sign language recognition approaches use no hand or pose preprocessing as a separate step (e.g., [22]), and indeed many signs involve large motions that do not require fine-grained gesture understanding. However, for fingerspelling recognition it is particularly important to understand fine-grained distinctions in handshape.…”
Section: Related Work (mentioning)
confidence: 99%
“…To tackle the problem, we exploit weak labels covering three modalities, namely gesture, mouth shape and hand shape, and exploit the fact that all three contain sequential information with loose time synchronisation with respect to each other. We extend our previous work on hybrid HMM modelling for sign language recognition [3] [4] [5] by adding multi-stream HMMs with synchronisation constraints. The hybrid HMM modelling has been shown to outperform other sequence learning approaches on sign language recognition data sets while requiring less memory and allowing for deeper architectures [4].…”
Section: Related Work (mentioning)
confidence: 99%