2018
DOI: 10.1007/s11042-018-6565-5

On the role of multimodal learning in the recognition of sign language

Abstract: Sign Language Recognition (SLR) has become one of the most important research areas in the field of human-computer interaction. SLR systems are meant to automatically translate sign language into text or speech, in order to reduce the communication gap between deaf and hearing people. The aim of this paper is to exploit multimodal learning techniques for accurate SLR, making use of data provided by Kinect and Leap Motion. In this regard, single-modality approaches as well as different multimodal methods, …

Cited by 22 publications (8 citation statements). References 22 publications.
“…The large-scale dataset-based investigation could be future work to improve recognition accuracy.

RGB, Depth, Dynamic:
  [92] isoGD, SBU, NATOPS, SKIG
RGB, Dynamic:
  Rastgoo et al [3] RKS-PERSIANSIGN, NYU
  Köpüklü et al [93] EgoGesture, NVIDIA benchmarks
  Lim et al [94] RWTH-BOSTON-50, ASLLVD
  Chen et al [95] DHG-14/28 Dataset, SHREC'17 Track Dataset
RGB, Depth, Static:
  Ferreira et al [96] Real video samples
  Gomez-Donoso et al [97] STB
  Spurr et al [98] NYU, STB, MSRA, ICVL
  Kazakos et al [99] NYU
RGB, Static:
  Li et al [100] B2RGB-SH, STB
  Mueller et al [101] EgoDexter, Dexter, STB
  Victor [102] Egohands
Depth, Static:
  Baek et al [103] BigHand2.2M, MSRA, ICVL, NYU
  Moon et al [104] MSRA, ICVL, NYU
  Ge et al [105] MSRA, ICVL, NYU
  Ge et al [106] MSRA, NYU
  Dibra et al [107] ICVL, NYU
  Sinha et al [108] NYU
3D, RGB:
  Zimmermann and Brox [109] Dexter, STB
3D, Depth:
  Marin-Jimenez et al [110] UBC3V, ITOP
  Deng et al [111] NYU
  Oberweger et al [112] MSRA
  Oberweger et al [113] NYU
2D, Depth, RGB:
  Rastgoo et al [114] Massey 2012, ASL Fingerspelling A, SL Surrey
  Duan et al [115] RGBD-HuDaAct, isoGD
2D, Depth:
  Chen et al [116] NYU, ICVL, MSRA
  Dadashzadeh et al [117] OUHANDS
  Wang et al [118] Human3.6M
  Yuan et al [119] BigHand2.2M, MSRA, ICVL, NYU
  Guo et al [120] ITOP, MSRA, ICVL, NYU
  Fang and Lei [121] ICVL, NYU
  Madadi et al [122] MSRA, NYU
  Wang et al [123] isoGD
  Haque et al [124] EVAL, ITOP
  Tagliasacchi et al [125] Real video samples
  Rastgoo et al…”
Section: A. Manual SLR (mentioning, confidence: 99%)
“…Kumar et al [1] used Kinect and Leap Motion sensors for data acquisition and achieved 96.33% accuracy on a 50-sign Indian sign language dataset when both data modalities were used. Ferreira et al [24] developed a multimodal SLR system to recognize 10 motionless signs in American Sign Language. In their study, they used RGB, depth, and 3D skeletal data obtained with a Leap Motion sensor.…”
Section: Related Work (mentioning, confidence: 99%)
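The Kinect-plus-Leap-Motion setup described in the statement above is a common multimodal pattern in SLR. As a minimal illustration, the sketch below shows feature-level (late) fusion of two modality streams in PyTorch; the feature dimensions, layer sizes, and 50-sign class count are hypothetical placeholders, not values taken from the cited works.

```python
# A minimal sketch of late (feature-level) fusion for two sensing modalities,
# loosely following the Kinect + Leap Motion setup quoted above. All sizes
# below are illustrative assumptions.
import torch
import torch.nn as nn

class LateFusionSLR(nn.Module):
    def __init__(self, kinect_dim=2048, leap_dim=63, num_classes=50):
        super().__init__()
        # Per-modality encoders; each maps raw features to a shared-size embedding.
        self.kinect_branch = nn.Sequential(nn.Linear(kinect_dim, 256), nn.ReLU())
        self.leap_branch = nn.Sequential(nn.Linear(leap_dim, 256), nn.ReLU())
        # Classifier over the concatenated per-modality embeddings.
        self.classifier = nn.Linear(256 * 2, num_classes)

    def forward(self, kinect_feats, leap_feats):
        fused = torch.cat([self.kinect_branch(kinect_feats),
                           self.leap_branch(leap_feats)], dim=1)
        return self.classifier(fused)

# Example forward pass with random stand-in features for a batch of 4 signs.
model = LateFusionSLR()
logits = model(torch.randn(4, 2048), torch.randn(4, 63))
print(logits.shape)  # torch.Size([4, 50])
```

Concatenation-based late fusion is only one option; the surveyed works also compare single-modality baselines against such combined representations.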
“…In order to extract the manual signs from the noisy background of the images, an automatic hand detection algorithm [28] is used as a pre-processing step. The images are then cropped, resized to the average sign size of the training set, and normalized to the range [−1, 1].…”
Section: A. Implementation Details (mentioning, confidence: 99%)
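The pre-processing pipeline quoted above (hand detection, cropping, resizing to the average sign size, and normalization to [−1, 1]) can be sketched as follows. This is an illustrative Python/OpenCV version under stated assumptions: the hand detector from [28] is stubbed by a given bounding box, and AVG_SIGN_SIZE is a hypothetical placeholder for the training set's average sign size.

```python
# A minimal sketch of the quoted pre-processing: crop the image to a detected
# hand bounding box, resize to the training set's average sign size, and scale
# pixel values to [-1, 1]. The hand detector itself is stubbed out here.
import cv2
import numpy as np

AVG_SIGN_SIZE = (96, 96)  # hypothetical average (width, height) of training signs

def preprocess(image: np.ndarray, hand_box: tuple) -> np.ndarray:
    """Crop to the hand region, resize, and normalize intensities to [-1, 1]."""
    x, y, w, h = hand_box            # box from the (stubbed) hand detector
    crop = image[y:y + h, x:x + w]
    crop = cv2.resize(crop, AVG_SIGN_SIZE)
    return crop.astype(np.float32) / 127.5 - 1.0  # [0, 255] -> [-1, 1]

# Example with a synthetic frame and a dummy detection box.
frame = np.random.randint(0, 256, (480, 640, 3), dtype=np.uint8)
sample = preprocess(frame, (200, 150, 120, 120))
print(sample.shape, sample.min() >= -1.0, sample.max() <= 1.0)
```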
“…The images are then cropped, resized to the average sign size of the training set, and normalized to the range [−1, 1]. Throughout this section, the proposed model is compared with state-of-the-art methods for each dataset [15], [16], [24], [27], [28]. Nevertheless, to further attest to the robustness of the proposed model, two different baselines are also implemented: 1) (Baseline 1) a CNN trained from scratch with ℓ2 regularization.…”
Section: A. Implementation Details (mentioning, confidence: 99%)
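Baseline 1 above is a CNN trained from scratch with ℓ2 regularization. The sketch below shows one way such a baseline might look in PyTorch, where the ℓ2 penalty enters through the optimizer's weight_decay argument; the architecture, 96x96 input size, 50-class output, and hyperparameters are assumptions for illustration, not the paper's exact baseline.

```python
# A minimal sketch of a "Baseline 1"-style CNN trained from scratch with L2
# regularization. In PyTorch, weight_decay applies the L2 penalty by adding
# lambda * w to each parameter's gradient during the optimizer step.
import torch
import torch.nn as nn

baseline = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(64 * 24 * 24, 50),  # assumes 96x96 inputs and 50 sign classes
)
optimizer = torch.optim.SGD(baseline.parameters(), lr=1e-2, weight_decay=1e-4)
criterion = nn.CrossEntropyLoss()

# One illustrative training step on random stand-in data.
images = torch.randn(8, 3, 96, 96)
labels = torch.randint(0, 50, (8,))
loss = criterion(baseline(images), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(float(loss))
```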