2022
DOI: 10.1109/tmm.2021.3070438

A Comprehensive Study on Deep Learning-Based Methods for Sign Language Recognition

Cited by 79 publications (38 citation statements)
References 59 publications
“…Sincan et al in [ 16 ], captured isolated Turkish sign language glosses using Kinect sensors with a large variety of indoor and outdoor backgrounds, revealing the importance of capturing videos with various backgrounds. Adaloglou et al in [ 17 ], created a large sign language dataset with RealSense D435 sensor that records both RGB and depth information. The dataset contain continuous and isolated sign videos and is appropriate for both isolated and continuous sign language recognition tasks.…”
Section: Sign Language Capturing | mentioning
confidence: 99%
“…GRSL [ 15 ] is another CSLR dataset of Greek sign language that is used in home care services, which contains multiple modalities, such as RGB, depth and skeletal joints. On the other hand, GSL [ 17 ] is a large Greek sign language dataset created to assist communication of Deaf people with public service employees. The dataset was created with a RealSense D435 sensor that records both RGB and depth information.…”
Section: Sign Language Capturing | mentioning
confidence: 99%
“…Random sampling or discarding of frames is one of the most straightforward techniques found in literature, where approximately 20% of input is eliminated. In [89], this technique is complemented by random changes of brightness, saturation, and other image parameters. Some of the data augmentation methods used in [90] include Gaussian Noise, Just Counter, and Future Prediction.…”
Section: Normalization and Filtering | mentioning
confidence: 99%
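The frame-discarding augmentation mentioned in the statement above can be illustrated with a minimal Python sketch. It is not taken from the cited works: the function name `drop_frames`, the `drop_ratio` and `seed` parameters, and the choice to preserve temporal order are all assumptions made for illustration, with the 20% ratio taken from the quoted text.

```python
import random


def drop_frames(frames, drop_ratio=0.2, seed=None):
    """Randomly discard roughly `drop_ratio` of the frames.

    Hypothetical helper for illustration; the surviving frames keep
    their original temporal order.
    """
    if not frames:
        return []
    rng = random.Random(seed)
    # Keep at least one frame so the clip never becomes empty.
    n_keep = max(1, round(len(frames) * (1.0 - drop_ratio)))
    # Sample indices without replacement, then restore temporal order.
    keep_idx = sorted(rng.sample(range(len(frames)), n_keep))
    return [frames[i] for i in keep_idx]


# Stand-in for a 50-frame video clip; integers play the role of frames.
clip = list(range(50))
augmented = drop_frames(clip, drop_ratio=0.2, seed=0)
```

Sampling indices rather than shuffling the frames is one way to keep the sign's motion in sequence; a real pipeline might instead drop frames independently with a fixed probability.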
“…Therefore, feature extraction is another necessary element of all transformer-based neural models, where the most relevant features derived from input tokens are selected and later used for model training [85], [136]. Some of the features delineate between signs (inter-cue features), while others are useful to differentiate the particular gloss from similar ones (intra-cue features) [89], [137].…”
Section: Transformer-based Approach | mentioning
confidence: 99%