An Efficient Two-Stream Network for Isolated Sign Language Recognition Using Accumulative Video Motion

Luqman, Hamzah

doi:10.1109/access.2022.3204110

Cited by 19 publications

(16 citation statements)

References 58 publications


“…The feature learning phase receives these two volumes, out of which one of them is dedicated to the hand region, another represents the whole gesture area. [11] B. Feature Learning Hand configuration's precise spatial and temporal properties are learned by the maiden C3D instance.…”

Section: Methodology A. Input Preprocessing (mentioning)

confidence: 99%

“…The result of this step is two feature vectors, each with 4096 size. [11] C. Feature Fusion and Classification After dimension reduction, we may acquire a precise representation of integrated features, resulting in reduced computing difficulty and higher face identification accuracy. Feature fusion aids in the complete learning of picture characteristics for the description of their rich internal information.…”

Section: Methodology A. Input Preprocessing (mentioning)

confidence: 99%

“…Combining training picture features vector from the common weight network layer with extracted features made up of other numerical data allows the proposed model to use as many features as feasible for the subsequent classification. [11] Frame work…”

Section: Methodology A. Input Preprocessing (mentioning)

confidence: 99%

A Review on Dynamic Hand Gesture Recognition Techniques

A¹, Harish², Sambhrama³ et al. 2023, IJRASET

Body language is one of the nonverbal methods of communication, and it comprises hand gestures, arm movements, posturing, and gestures and facial expressions. One way to communicate information through the movement of the body is through gestures. HGR is a smart, intuitive, and easy method of human-computer interaction (HCI). HGR systems have two key applications: SLR and GBC. To help the deaf communicate with the hearing community, SLR tries to automatically interpret SLs via a computer. The idea that SL is a highly ordered and primarily symbolic collection of human gestures is what led to the development of universal gesture-based HCI.

“…At the same time, gesture recognition can help create a richer and more interactive learning experience in online teaching environments [ 4 ]. For special education, such as students with hearing impairments, gesture recognition can also be used to identify and learn sign language [ 5 , 6 ]. Teaching gesture recognition brings many possibilities to education by improving teaching quality and enhancing student learning experience.…”

Section: Introduction (mentioning)

confidence: 99%

ST-TGR: Spatio-Temporal Representation Learning for Skeleton-Based Teaching Gesture Recognition

Chen, Huang, Liu et al. 2024, Sensors

Teaching gesture recognition is a technique used to recognize the hand movements of teachers in classroom teaching scenarios. This technology is widely used in education, including for classroom teaching evaluation, enhancing online teaching, and assisting special education. However, current research on gesture recognition in teaching mainly focuses on detecting the static gestures of individual students and analyzing their classroom behavior. To analyze the teacher’s gestures and mitigate the difficulty of single-target dynamic gesture recognition in multi-person teaching scenarios, this paper proposes skeleton-based teaching gesture recognition (ST-TGR), which learns through spatio-temporal representation. This method mainly uses the human pose estimation technique RTMPose to extract the coordinates of the keypoints of the teacher’s skeleton and then inputs the recognized sequence of the teacher’s skeleton into the MoGRU action recognition network for classifying gesture actions. The MoGRU action recognition module mainly learns the spatio-temporal representation of target actions by stacking a multi-scale bidirectional gated recurrent unit (BiGRU) and using improved attention mechanism modules. To validate the generalization of the action recognition network model, we conducted comparative experiments on datasets including NTU RGB+D 60, UT-Kinect Action3D, SBU Kinect Interaction, and Florence 3D. The results indicate that, compared with most existing baseline models, the model proposed in this article exhibits better performance in recognition accuracy and speed.

“…Skeletal hand features can be obtained from sensor devices [4], such as depth cameras, or hand pose estimators, which use deep learning to detect and track the key points of the hand skeleton from images or videos. Some sign language models use skeleton-based methods alone, while others combine them with other modalities, such as RGB and depth, to provide complementary information [13]. Hamza [13] proposes an efficient hand key posture method for isolated sign language recognition to address variations in background and lighting.…”

Section: Introduction (mentioning)

confidence: 99%

Enhanced Weak Spatial Modeling Through CNN-Based Deep Sign Language Skeletal Feature Transformation

Alamri, Bala Abdullahi, Khan et al. 2024, IEEE Access

Recent sign language skeletal-based feature models (SLSm) consist of various distracting coordinates that lead to complex deep-learning modeling. However, SLSm is not purely a spatial-temporal coordinate arrangement problem; it is also limited by human dynamics and feature aggregations. The objectives of this work are twofold: (a) to transform the skeletal features of the SLSm model to address the problem of variations in viewpoint and changes across features of repeated signs due to human dynamics, and (b) to exploit the potential of exhaustive searching in dropping distracting features to prevent complex deep learning modeling. Method: We propose a transformed skeletal feature-based model (SCT) from a feature thresholding theory. We first extract the hand-skeletal joint-related features relevant to the coordinates and positions of the hand transcription that efficiently capture human dynamics. The extracted features are transformed into a subset of a predefined threshold and fed into the proposed ensemble exhaustive feature searching. The searched features are transformed into their equivalent deep input image sequences. Outcomes: By leveraging the skeletal-based transformed and deep spatial features, the proposed method demonstrates robust performance in sign language recognition, surpassing recent deep learning models in accuracy and simplicity. The proposed skeletal features demonstrate superiority in learning complex hand gestures of public data sets, improving accuracy by more than 2%.

Index Terms: Human-computer interaction, End-to-end deep neural network, Multimodal data interaction, Hand gestures, Sign language recognition, Pattern recognition.
