Automatic Recognition of Mexican Sign Language Using a Depth Camera and Recurrent Neural Networks

Mejía-Peréz, Kenneth; Córdova-Esparza, Diana-Margarita; Terven, Juan R.; Herrera-Navarro, Ana M.; Ramírez, Ma. Teresa García; Ramírez-Pedraza, Alfonso

doi:10.3390/app12115523

Cited by 21 publications

(28 citation statements)

References 25 publications


“…Among them, Amorim et al developed 67 skeleton key points based on 20 classes of the ASLLVD dataset, which include word level sign language word recognition using the enhancement of [21] by adding graphic layout [22]. Solis et al collected 30 Mexican sign language (MSL) word skeletons considering body, face and hand information using a spatial camera, then applied RNN and LSTM and achieved good accuracy [14]. Xia et al made a dataset by considering 67 whole body key points and achieved satisfactory performance using RNN with their self-development dataset [23].…”

Section: Related Work (mentioning)

confidence: 99%

“…This dataset was collected from 30 different MSL signs with 25 repetitions for each sign. They recorded 3000 samples for the dataset in total, specifically 20 videos for each sign and extracted 20 frames from each video [14].…”

Section: B. MSL Dataset (mentioning)

confidence: 99%

“…In Table 7, we present the performance accuracy of our proposed model in the context of hand gesture recognition, specifically focusing on sign language interpretation using the MSL dataset. A significant benchmark in sign language interpretation was set by the work of the author in [47], where multiple model architectures based on recurrent neural networks, including long short-term memory (LSTM) and gated recurrent units (GRU), were applied. Achieving a commendable accuracy of 96.44%, their competence in capturing temporal dependencies within full-body joint points was evident.…”

Section: F Performance Accuracy and State Of The Art Comparison For T... (mentioning)

confidence: 99%

“…Vision-based sign language recognition systems lead to the challenges of the sensor-based system by considering the scalability, flexibility, and low-cost approach, which make it popular among researchers. Most of the computer vision researchers have been working to develop a sign language recognition (SLR) system with two modalities: pixel-based image [3], [4], [13] and skeleton point of the image [13], [14]. In the image-based system, they took RGB images as the input and applied different machine learning and deep learning on it either on the handcrafted feature or directly on pixel values.…”

Section: Introduction (mentioning)

confidence: 99%


Sign Language Recognition Using Graph and General Deep Neural Network Based on Large Scale Dataset

Miah, Hasan, Nishimura et al. 2024. IEEE Access

Sign Language Recognition (SLR) represents a revolutionary technology aiming to establish communication between deaf and non-deaf communities, surpassing traditional interpreter-based approaches. Existing efforts in automatic sign recognition predominantly rely on hand skeleton joint information rather than image pixels, to address challenges like partial occlusion and redundant backgrounds. However, besides hand information, body motion and facial expression play an essential role in increasing the inner gesture variance in expressing sign language emotion for large-scale sign word datasets. Recently, some researchers have been working to develop multi-gesture-based SLR systems, but their performance accuracy and efficiency are unsatisfactory for real-time deployment. Addressing these limitations, we propose a novel approach, a two-stream multistage graph convolution with attention and residual connection (GCAR), designed to extract spatial-temporal contextual information. The multistage GCAR system, incorporating a channel attention module, dynamically enhances attention levels, particularly for non-connected skeleton points during specific events within spatial-temporal features. The methodology involves capturing joint skeleton points and motion, offering a comprehensive understanding of a person's entire body movement during sign language gestures and feeding this information into two streams. In the first stream, joint key features undergo processing through sep-TCN, graph convolution, deep learning layer, and a channel attention module across multiple stages, generating intricate spatial-temporal features in sign language gestures. Simultaneously, the joint motion is processed in the second stream, mirroring the steps of the first branch. The fusion of these two features yields the final feature vector, which is then fed into the classification module. The model excels in capturing discriminative structural displacements and short-range dependencies by leveraging unified joint features projected onto a high-dimensional space. Owing to the effectiveness of these features, the proposed method achieved significant accuracies: 90.31%, 94.10%, 99.75%, and 34.41%, for the WLASL, PSL, MSL, and ASLLVD large-scale datasets, respectively, with 0.69 million parameters. The high-performance accuracy, coupled with stable computational complexity, demonstrates the superiority of the proposed model. This innovative approach is anticipated to redefine the landscape of sign language recognition, setting a new standard in the field.


“…Mejia-Perez et al assessed different architectures of recurrent neural networks (RNNs) for the recognition of 26 dynamic words and four static alphabet letters [20]. Four people performed each sign 25 times against a controlled background while wearing black clothes.…”

Section: Related Work (mentioning)

confidence: 99%
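As a quick consistency check, the counts quoted in this statement can be multiplied out. The result agrees with the 30 signs and 3000 samples mentioned in the other citation statements. This is only an arithmetic sketch, and the variable names are illustrative rather than taken from the cited paper.

```python
# Figures taken from the citation statement above.
dynamic_words = 26
static_letters = 4
signers = 4
repetitions_per_signer = 25

# Total number of distinct signs in the dataset.
signs = dynamic_words + static_letters

# Each signer repeats each sign the same number of times.
samples_per_sign = signers * repetitions_per_signer
total_samples = signs * samples_per_sign

print(signs, samples_per_sign, total_samples)  # prints: 30 100 3000
```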

Use of Spherical and Cartesian Features for Learning and Recognition of the Static Mexican Sign Language Alphabet

et al. 2022


The automatic recognition of sign language is very important to allow for communication by hearing impaired people. The purpose of this study is to develop a method of recognizing the static Mexican Sign Language (MSL) alphabet. In contrast to other MSL recognition methods, which require a controlled background and permit changes only in 2D space, our method only requires indoor conditions and allows for variations in the 3D pose. We present an innovative method that can learn the shape of each of the 21 letters from examples. Before learning, each example in the training set is normalized in the 3D pose using principal component analysis. The input data are created with a 3D sensor. Our method generates three types of features to represent each shape. When applied to a dataset acquired in our laboratory, an accuracy of 100% was obtained. The features used by our method have a clear, intuitive geometric interpretation.
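The abstract mentions normalizing each example's 3D pose with principal component analysis but gives no further detail. The following is a minimal sketch of one common way to do this, assuming the input is an (N, 3) point cloud and that "normalizing" means centering the points and rotating them onto their principal axes. The function name `normalize_pose_pca` and the synthetic test cloud are hypothetical and not taken from the cited paper.

```python
import numpy as np

def normalize_pose_pca(points):
    """Center an (N, 3) point cloud and rotate it onto its principal axes.

    Assumption: this is a generic PCA alignment, not the cited authors' exact method.
    The sign of each axis is ambiguous, so results may differ by axis flips.
    """
    points = np.asarray(points, dtype=float)
    centered = points - points.mean(axis=0)
    # The eigenvectors of the 3x3 covariance matrix are the principal axes.
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # eigh returns eigenvalues in ascending order; put the largest variance first.
    order = np.argsort(eigvals)[::-1]
    axes = eigvecs[:, order]
    # Express every point in the principal-axis coordinate frame.
    return centered @ axes

# Synthetic elongated cloud, rotated about the z axis and shifted.
rng = np.random.default_rng(0)
cloud = rng.normal(size=(200, 3)) * np.array([5.0, 2.0, 0.5])
theta = 0.7
rot = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta),  np.cos(theta), 0.0],
    [0.0,            0.0,           1.0],
])
aligned = normalize_pose_pca(cloud @ rot.T + np.array([10.0, -3.0, 2.0]))
```

After alignment the cloud has zero mean and a diagonal covariance matrix, so two recordings of the same shape in different orientations map to the same frame up to axis flips.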


Fingerspelling Recognition in Mexican Sign Language (LSM) Using Machine Learning

Morfín-Chávez, Gortarez-Pelayo, Lopez-Nava 2023. Lecture Notes in Computer Science
