2021
DOI: 10.1145/3463498

Enabling Real-time Sign Language Translation on Mobile Platforms with On-board Depth Cameras

Abstract: In this work we present SUGO, a depth video-based system for translating sign language to text using a smartphone's front camera. While exploiting depth-only videos offers benefits, such as being less privacy-invasive than RGB videos, it introduces new challenges, including low video resolution and the sensors' sensitivity to user motion. We overcome these challenges by diversifying our sign language video dataset to be robust to various usage scenarios via data augmentation …

Cited by 25 publications (5 citation statements)
References 49 publications
“…The depth stream assists in learning more complex features by ignoring the video background. Recently, the SUGO model based on 3D-CNN is proposed that uses data acquired through LIDAR [31]. Appearance-based methods for SLR critically suffer from high computational complexity in terms of memory requirements and processing power and have significantly low accuracies.…”
Section: Related Work (mentioning)
confidence: 99%
“…The low complexity of such calculations implies low processing power requirements [7], and, thus, ToF cameras are expected to become ubiquitously available on smartphones [15,81]. In fact, ToF cameras on smartphones have been recently used in novel applications such as translating sign language [57], recognizing hand gestures [86], and improving lighting estimation for AR [97]. ToF cameras are known to suffer from several systematic errors caused by the internal temperature of the camera, integration time settings, and infrared demodulation errors [18,19].…”
Section: A Primer on ToF Cameras in AR Devices (mentioning)
confidence: 99%
“…introduced to clinical applications and are being applied to various domains [11][12][13][14][15][16][17][18]. In this work, we follow the paradigm of exploiting machine learning algorithms together with clinical data and compare candidate models that can suit our purposes.…”
Section: PLOS ONE (mentioning)
confidence: 99%