2023
DOI: 10.3390/s23229068

Toward a Vision-Based Intelligent System: A Stacked Encoded Deep Learning Framework for Sign Language Recognition

Muhammad Islam, Mohammed Aloraini, Suliman Aladhadh, et al.

Abstract: Sign language recognition, an essential interface between the hearing and deaf-mute communities, faces challenges with high false positive rates and computational costs, even with the use of advanced deep learning techniques. Our proposed solution is a stacked encoded model, combining artificial intelligence (AI) with the Internet of Things (IoT), which refines feature extraction and classification to overcome these challenges. We leverage a lightweight backbone model for preliminary feature extraction and use…

Cited by 3 publications (3 citation statements)
References 58 publications
“…The exceptional performance of the hand gesture recognition system is rooted in meticulous dataset preparation, encompassing a diverse array of lighting conditions and subjects, in conjunction with deploying the EfficientNet B3 model within the CNN framework [26]. The scalability and efficiency inherent to this model were instrumental in achieving a balanced and effective learning process, thereby facilitating the system's ability to recognize gestures with high accuracy under varying environmental conditions and across different individuals [27].…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…The work in [22] proposed an LSTM-based network, further enriched by the determinantal point process. Since this groundbreaking work, LSTM has emerged as a cornerstone for VS, with frequently advanced techniques developing [25,33,34]. For instance, the work [33] introduced a novel loss to gauge the fidelity of predicted summaries in preserving original semantic information.…”
Section: Supervised and Unsupervised Video Summarization
Citation type: mentioning
confidence: 99%
“…Since this groundbreaking work, LSTM has emerged as a cornerstone for VS, with frequently advanced techniques developing [25,33,34]. For instance, the work [33] introduced a novel loss to gauge the fidelity of predicted summaries in preserving original semantic information. The study [35] devises VS as a temporal interest detection challenge addressed by the anticipated DSNet.…”
Section: Supervised and Unsupervised Video Summarization
Citation type: mentioning
confidence: 99%