2021
DOI: 10.48550/arxiv.2110.06161
Preprint
Sign Language Recognition via Skeleton-Aware Multi-Model Ensemble

Abstract: Sign language is commonly used by deaf or mute people to communicate but requires extensive effort to master. It is usually performed with fast yet delicate movements of hand gestures, body posture, and even facial expressions. Current Sign Language Recognition (SLR) methods usually extract features via deep neural networks and suffer from overfitting due to limited and noisy data. Recently, skeleton-based action recognition has attracted increasing attention due to its subject-invariant and background-invariant…

Cited by 10 publications (19 citation statements)
References 66 publications (77 reference statements)
“…The Skeleton-Aware multi-stream sign language recognition framework is one of the most recent graph-based systems for sign language recognition [32,33]. These frameworks combine ST-GCN [31] with additional input channels such as RGB frames and optical flow; in this multimodality scheme, the different modalities are integrated and fused at different levels.…”
Section: Related Work
confidence: 99%
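The multimodality scheme described above, in which per-stream predictions are combined at the score level, can be sketched as a weighted late fusion of class scores. This is a minimal illustration, not the paper's actual ensemble: the stream names, class count, and weights below are hypothetical.

```python
import numpy as np

def late_fusion(stream_scores, weights=None):
    """Combine per-stream class scores by a weighted sum (late fusion).

    stream_scores: list of (num_classes,) arrays, one per modality
    weights: optional per-stream weights; defaults to uniform
    """
    scores = np.stack(stream_scores)          # (num_streams, num_classes)
    if weights is None:
        weights = np.ones(len(stream_scores))
    weights = np.asarray(weights, dtype=float)
    weights /= weights.sum()                  # normalize weights to sum to 1
    return weights @ scores                   # fused (num_classes,) scores

# Hypothetical softmax scores from three modalities over 4 sign classes
skeleton = np.array([0.1, 0.6, 0.2, 0.1])
rgb      = np.array([0.2, 0.5, 0.2, 0.1])
flow     = np.array([0.1, 0.4, 0.4, 0.1])

# Weight the skeleton stream more heavily, then pick the argmax class
fused = late_fusion([skeleton, rgb, flow], weights=[2, 1, 1])
predicted = int(np.argmax(fused))  # class 1 wins after fusion
```

In practice the fusion weights are tuned on a validation set; fusing at the score level lets each stream keep its own specialized backbone.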
“…Jiang et al [142] devised SAM-SLR (a skeleton-aware multi-modal framework) for isolated sign language recognition. Its skeleton stream uses an SSTCN (separable spatial-temporal convolution network), and the framework achieves strong accuracy on the AUTSL dataset, with a top-1 accuracy of 98.42% for RGB and 98.53% for RGB-D. Papastratis et al [143] Jiang et al [145] designed SAM-SLR-v2 (a skeleton-aware multi-modal framework with a global ensemble model) for isolated sign language recognition. They achieved a top-1 accuracy of 98.53% on AUTSL (RGB-D all); top-1 accuracies of 59.39% per instance and 56.63% per class on the WLASL2000 dataset; and a top-1 accuracy of 99% on the isolated SLR500 dataset.…”
Section: B. Study of Current State-of-the-art Models for Sign Language...
confidence: 99%
“…More recently, the success of pose estimation techniques and Graph Convolutional Network (GCN) architectures has shifted researchers' attention to skeleton-based approaches in both action recognition and SLR domains (Kipf and Welling, 2016; Yan et al., 2018; Cao et al., 2019; Jiang et al., 2021). In these methods, graphs are often formed by connecting skeleton joint information (obtained via pose estimation techniques) according to the natural human body connections and processed through a GCN network.…”
Section: Introduction
confidence: 99%
“…In these methods, graphs are often formed by connecting skeleton joint information (obtained via pose estimation techniques) according to the natural human body connections and processed through a GCN network. As an improvement over earlier GCN architectures, ST-GCN was proposed for skeleton-based action recognition to model the spatial and temporal dimensions simultaneously, and was later adapted to the SLR problem (Yan et al., 2018; Jiang et al., 2021).…”
Section: Introduction
confidence: 99%
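The graph construction described in these statements — joints as nodes, natural body links as edges, features propagated by a graph convolution — can be sketched as follows. This is a toy illustration of the standard symmetrically normalized GCN update, not the actual skeleton topology or ST-GCN layer from the cited papers; the 5-joint chain, feature sizes, and random weights are assumptions.

```python
import numpy as np

# Toy skeleton: 5 joints connected along natural body links
# (e.g. wrist-elbow-shoulder-neck-head); edges are illustrative only.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
num_joints = 5

# Adjacency matrix with self-loops (A + I), as in the standard GCN
A = np.eye(num_joints)
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# Symmetric normalization: A_hat = D^{-1/2} (A + I) D^{-1/2}
deg = A.sum(axis=1)
D_inv_sqrt = np.diag(deg ** -0.5)
A_hat = D_inv_sqrt @ A @ D_inv_sqrt

# One graph-convolution step on per-joint features H: H' = ReLU(A_hat H W)
rng = np.random.default_rng(0)
H = rng.standard_normal((num_joints, 3))   # e.g. (x, y, confidence) per joint
W = rng.standard_normal((3, 8))            # learnable weights in a real model
H_out = np.maximum(A_hat @ H @ W, 0.0)     # (5, 8) updated joint features
```

ST-GCN extends this spatial step with temporal convolutions across the same joint over consecutive frames, modeling both dimensions simultaneously.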