Video-based isolated hand sign language recognition using a deep cascaded model

Rastgoo, Razieh; Kiani, Kourosh; Escalera, Sérgio

doi:10.1007/s11042-020-09048-5

Cited by 61 publications

(31 citation statements)

References 26 publications

“…On the other hand, some studies use the combination of deep learning and traditional methods. Rastgoo et al [18] used some handcrafted features and 2D-CNNs to obtain spatial information.…”

Section: Related Work (mentioning)

confidence: 99%

Using Motion History Images With 3D Convolutional Networks in Isolated Sign Language Recognition

Sincan

Keleş

2022

IEEE Access

Sign language recognition using computational models is a challenging problem that requires simultaneous spatio-temporal modeling of the multiple sources, i.e. faces, hands, body, etc. In this paper, we propose an isolated sign language recognition model based on a model trained using Motion History Images (MHI) that are generated from RGB video frames. RGB-MHI images represent spatio-temporal summary of each sign video effectively in a single RGB image. We propose two different approaches using this RGB-MHI model. In the first approach, we use the RGB-MHI model as a motion-based spatial attention module integrated into a 3D-CNN architecture. In the second approach, we use RGB-MHI model features directly with the features of a 3D-CNN model using a late fusion technique. We perform extensive experiments on two recently released large-scale isolated sign language datasets, namely AUTSL and BosphorusSign22k. Our experiments show that our models, which use only RGB data, can compete with the state-of-the-art models in the literature that use multi-modal data.

“…They obtained an accuracy of 92.60% when using only 2D-CNN, 97.30% when using only 3D-CNN, and 99.20% when using only the fusion model. Rastgoo et al [16] presented a deep-based model for effective hand sign recognition by training: first, the single shot detector (SSD) model for hand identification using annotated videos of five online sign dictionaries, and second, a combinational model with a CNN and different spatial features. They performed a comprehensive study of sequence learning utilizing various pre-train models, spatial features, and temporal-based models.…”

Section: Related Work (mentioning)

confidence: 99%

Dynamic hand gesture recognition of Arabic sign language by using deep convolutional neural networks

Ismail

Dawwd

Ali

2022

IJEECS

In computer vision, one of the most difficult problems is human gestures in videos recognition Because of certain irrelevant environmental variables. This issue has been solved by using single deep networks to learn spatiotemporal characteristics from video data, and this approach is still insufficient to handle both problems at the same time. As a result, the researchers fused various models to allow for the effective collection of important shape information as well as precise spatiotemporal variation of gestures. In this study, we collected the dynamic dataset for twenty meaningful words of Arabic sign language (ArSL) using a Microsoft Kinect v2 camera. The recorded data included 7350 red, green, and blue (RGB) videos and 7350 depth videos. We proposed four deep neural networks models using 2D and 3D convolutional neural network (CNN) to cover all feature extraction methods and then passing these features to the recurrent neural network (RNN) for sequence classification. Long short-term memory (LSTM) and gated recurrent unit (GRU) are two types of using RNN. Also, the research included evaluation fusion techniques for several types of multiple models. The experiment results show the best multi-model for the dynamic dataset of the ArSL recognition achieved 100% accuracy.

“…Isolated sign language recognition refers to the task of accurately detecting single sign gestures from videos and thus it is usually tackled similar to action and gesture recognition, as well as other types of video processing and classification tasks with the extraction and learning of highly discriminative features [63, 64, 65]. In the literature, a common approach to the task of isolated sign language recognition is the extraction of hand and mouth regions from the video sequences in an attempt to remove noisy backgrounds that can inhibit classification performance.…”

Section: Sign Language Recognition (mentioning)

confidence: 99%

“…In an effort to derive more discriminative features, Rastgoo et al in [63], proposed a multi-stream SLR method that gets as input hand regions, 3D hand pose features and Extra Spatial Hand Relation features (i.e., orientation and slope of hands). These features were concatenated and fed to an LSTM layer to derive the sign class.…”

Section: Sign Language Recognition (mentioning)

confidence: 99%

Artificial Intelligence Technologies for Sign Language

Papastratis

Chatzikonstantinou

Konstantinidis

et al.

2021

Sensors

AI technologies can play an important role in breaking down the communication barriers of deaf or hearing-impaired people with other communities, contributing significantly to their social inclusion. Recent advances in both sensing technologies and AI algorithms have paved the way for the development of various applications aiming at fulfilling the needs of deaf and hearing-impaired communities. To this end, this survey aims to provide a comprehensive review of state-of-the-art methods in sign language capturing, recognition, translation and representation, pinpointing their advantages and limitations. In addition, the survey presents a number of applications, while it discusses the main challenges in the field of sign language technologies. Future research direction are also proposed in order to assist prospective researchers towards further advancing the field.
