American Sign Language Words Recognition Using Spatio-Temporal Prosodic and Angle Features: A Sequential Learning Approach

Abdullahi, Sunusi Bala; Chamnongthai, Kosin

doi:10.1109/access.2022.3148132

Cited by 37 publications

(30 citation statements)

References 60 publications

“…The second phase of the BiRNN layers is trained to learn output of the previous layers to be initial state of first layers and yields output vector [equation not captured], and it is defined as: [equation not captured], where [definition not captured]. Finally, BiRNN extraction layers can be written uniformly as [43]: [equation not captured] where dominant and non-dominant hand index is denoted as n, and [symbol not captured] and [symbol not captured] denote forward and backward pass hidden state vectors, respectively. In Equation (26), the extraction layers of BiRNN not only give the relationship of video input features vector but also correlate to state of prior sequence.…”

Section: Methods (mentioning)

confidence: 99%
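The equations in the statement above did not survive extraction, so the following is only a minimal sketch of a generic bidirectional RNN, not the cited authors' formulation. It assumes a plain tanh recurrence and shows how forward-pass and backward-pass hidden state vectors can be computed per frame and concatenated into one feature vector. The names `rnn_pass`, `birnn_features`, and `init_params`, the dimensions, and the random weights are all hypothetical.

```python
import math
import random


def rnn_pass(xs, W_x, W_h, b, reverse=False):
    """Run a tanh RNN over a sequence and return one hidden vector per frame."""
    hidden = len(b)
    h = [0.0] * hidden  # initial hidden state
    states = [None] * len(xs)
    order = range(len(xs) - 1, -1, -1) if reverse else range(len(xs))
    for t in order:
        # New state from the current frame and the previous state in this direction.
        h = [
            math.tanh(
                sum(W_x[j][i] * xs[t][i] for i in range(len(xs[t])))
                + sum(W_h[j][k] * h[k] for k in range(hidden))
                + b[j]
            )
            for j in range(hidden)
        ]
        states[t] = h
    return states


def birnn_features(xs, fwd, bwd):
    """Concatenate forward and backward hidden states for every frame."""
    h_fwd = rnn_pass(xs, *fwd)
    h_bwd = rnn_pass(xs, *bwd, reverse=True)
    return [f + b for f, b in zip(h_fwd, h_bwd)]


def init_params(rng, input_dim, hidden_dim):
    """Small random weights (illustrative only, not trained)."""
    W_x = [[rng.uniform(-0.5, 0.5) for _ in range(input_dim)] for _ in range(hidden_dim)]
    W_h = [[rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)] for _ in range(hidden_dim)]
    b = [0.0] * hidden_dim
    return W_x, W_h, b


rng = random.Random(0)
# Hypothetical input: 5 frames, 4 features per frame.
frames = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
fwd = init_params(rng, 4, 3)
bwd = init_params(rng, 4, 3)
features = birnn_features(frames, fwd, bwd)
print(len(features), len(features[0]))  # 5 frames, each with 2 * 3 = 6 values
```

In this sketch the forward state at a frame summarizes the frames up to it, while the backward state summarizes the frames after it, which is the sense in which a bidirectional layer relates each input vector to the state of the surrounding sequence.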

American Sign Language Words Recognition of Skeletal Videos Using Processed Video Driven Multi-Stacked Deep LSTM

Abdullahi; Chamnongthai (2022). Sensors. Self cite.

Complex hand gesture interactions among dynamic sign words may lead to misclassification, which affects the recognition accuracy of the ubiquitous sign language recognition system. This paper proposes to augment the feature vector of dynamic sign words with knowledge of hand dynamics as a proxy and classify dynamic sign words using motion patterns based on the extracted feature vector. In this method, some double-hand dynamic sign words have ambiguous or similar features across a hand motion trajectory, which leads to classification errors. Thus, the similar/ambiguous hand motion trajectory is determined based on the approximation of a probability density function over a time frame. Then, the extracted features are enhanced by transformation using maximal information correlation. These enhanced features of 3D skeletal videos captured by a leap motion controller are fed as a state transition pattern to a classifier for sign word classification. To evaluate the performance of the proposed method, an experiment is performed with 10 participants on 40 double hands dynamic ASL words, which reveals 97.98% accuracy. The method is further developed on challenging ASL, SHREC, and LMDHG data sets and outperforms conventional methods by 1.47%, 1.56%, and 0.37%, respectively.

“…Multi-stacked BiLSTM is trained to obtain output probability vectors for all of its corresponding input vectors, predicted word classes, and confusion matrices. Multi-stacked layers are initialized with weight of extracted features, as follows [43]: [equation not captured] …”

Section: Methods (mentioning)

confidence: 99%
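The statement above mentions output probability vectors, predicted word classes, and confusion matrices. As a rough illustration of how those three outputs relate, and not a reproduction of the cited method, the sketch below turns hypothetical classifier scores into probabilities with a softmax, takes the most probable class as the prediction, and tallies a confusion matrix. The scores, labels, and function names are made up for the example.

```python
import math


def softmax(scores):
    """Convert raw scores into a probability vector that sums to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(prob_vector):
    """Return the index of the most probable class."""
    return max(range(len(prob_vector)), key=prob_vector.__getitem__)


def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix


# Hypothetical scores for four clips over three word classes.
scores = [
    [2.0, 0.5, 0.1],
    [0.2, 1.5, 0.3],
    [0.1, 0.4, 2.2],
    [0.3, 1.9, 0.2],
]
true_labels = [0, 1, 2, 0]

probs = [softmax(s) for s in scores]
predicted = [predict(p) for p in probs]
matrix = confusion_matrix(true_labels, predicted, num_classes=3)
print(predicted)  # [0, 1, 2, 1]
print(matrix)     # [[1, 1, 0], [0, 1, 0], [0, 0, 1]]
```

The off-diagonal entry in the first row records the one clip of class 0 that was predicted as class 1.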

“…Sign language is a medium of communication for hearing-impaired people, a group which includes 466 million people worldwide [1], and it is expressed using the fingers and hands. American Sign Language is the most populous among its peers, and thus consists of over ten thousand word gestures [2]. Moreover, 65% of ASL gestures represent sign words during a full conversation [3].…”

Section: Introduction (mentioning)

confidence: 99%

“…In addition, these features are used with a forehand view, which leads to misclassification; therefore, it is difficult to apply this method with a backhand view. The authors in [2,3,6,11,12] extracted hand features using a backhand view approach, which led to outstanding performance; however, these studies may fail to recognize words with a similar shape, rotation, and movement (SRM words). To address this problem with the existing works of [2,3,6,11,12], in this paper we propose the spatial-temporal body parts and hand relationship patterns (ST-BHR patterns) as the main feature of analysis for 72 isolated signed words in the SRM group [13] based on the backhand approach.…”

Section: Introduction (mentioning)

confidence: 99%

“…The authors in [2,3,6,11,12] extracted hand features using a backhand view approach, which led to outstanding performance; however, these studies may fail to recognize words with a similar shape, rotation, and movement (SRM words). To address this problem with the existing works of [2,3,6,11,12], in this paper we propose the spatial-temporal body parts and hand relationship patterns (ST-BHR patterns) as the main feature of analysis for 72 isolated signed words in the SRM group [13] based on the backhand approach. In this method, both single and double hands were applied, and their information was obtained using the 3D distance-based Cartesian product obtained from a 3D depth camera.…”

Section: Introduction (mentioning)

confidence: 99%

Backhand-Approach-Based American Sign Language Words Recognition Using Spatial-Temporal Body Parts and Hand Relationship Patterns

Chophuk; Chamnongthai; Chinnasarn (2022). Sensors. Self cite.

Most of the existing methods focus mainly on the extraction of shape-based, rotation-based, and motion-based features, usually neglecting the relationship between hands and body parts, which can provide significant information to address the problem of similar sign words based on the backhand approach. Therefore, this paper proposes four feature-based models. The spatial–temporal body parts and hand relationship patterns are the main feature. The second model consists of the spatial–temporal finger joint angle patterns. The third model consists of the spatial–temporal 3D hand motion trajectory patterns. The fourth model consists of the spatial–temporal double-hand relationship patterns. Then, a two-layer bidirectional long short-term memory method is used to deal with time-independent data as a classifier. The performance of the method was evaluated and compared with the existing works using 26 ASL letters, with an accuracy and F1-score of 97.34% and 97.36%, respectively. The method was further evaluated using 40 double-hand ASL words and achieved an accuracy and F1-score of 98.52% and 98.54%, respectively. The results demonstrated that the proposed method outperformed the existing works under consideration. However, in the analysis of 72 new ASL words, including single- and double-hand words from 10 participants, the accuracy and F1-score were approximately 96.99% and 97.00%, respectively.
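The abstract above reports accuracy and F1-score side by side. For readers comparing such figures, here is a minimal sketch of how both can be derived from a confusion matrix. The abstract does not say how the F1-score was averaged across classes, so the macro average used here is an assumption, and the matrix is invented for illustration.

```python
def accuracy(matrix):
    """Fraction of samples on the diagonal (rows: true class, columns: predicted)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def macro_f1(matrix):
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    n = len(matrix)
    scores = []
    for c in range(n):
        tp = matrix[c][c]
        fp = sum(matrix[r][c] for r in range(n)) - tp  # predicted c, truly another class
        fn = sum(matrix[c]) - tp                       # truly c, predicted another class
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n


# Hypothetical confusion matrix for three classes, ten samples each.
example = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 0, 10],
]
print(round(accuracy(example), 4))
print(round(macro_f1(example), 4))
```

Because the F1-score combines precision and recall per class, it can differ slightly from accuracy even on balanced data, which is consistent with the closely paired values reported in the abstract.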

An improved custom convolutional neural network based hand sign recognition using machine learning algorithm

Moon; Yenurkar; Nyangaresi; et al. (2024). Engineering Reports.

The biggest challenge the deaf and dumb group faces is that individuals around them do not understand sign language, which they use to communicate with one another. Written communication is slower than face‐to‐face contact, despite the fact that it can be used. Many sign languages have been developed around the world because they are more effective in emergency situations than text‐based communication. India in‐spite of having the large deaf population of almost 18 million and having only around 250 trained/untrained; skilled interpreters. The proposed system can utilize a custom convolution neural networks (CCNNs) model to identify hand motions in order to resolve this issue. This system uses a filter to process the hand before sending it through a classifier to identify the type of hand movements. CCNN strategy employs two levels of algorithm to predict and evaluate symbols that are increasingly similar to one another in order to get as close to precisely recognizing the symbol presented as possible. Convolutional neural networks (CNNs) are able to precisely identify a variety of gestures after being trained on large datasets of hand sign photographs. As a result of their frequent usage of many layers of filters and pooling to extract relevant information from the input images, these networks can recognize hand signs with an accuracy rate of 99.95%, which is much greater than previously built models like SIGNGRAPH, SVM, KNN, CNN + Bi‐LSTM, 3D‐CNN and 2D CNN network and 1D CNN skeleton network. The simulation result shows that a suggested CCNN‐based learning approach is useful for hand sign detection and future usage research when compared with existing machine learning models.
