Ilias Papastratis scite author profile

Continuous Sign Language Recognition (CSLR) refers to the challenging problem of recognizing sign language glosses and their temporal boundaries from weakly annotated video sequences. Previous methods focus mostly on visual feature extraction, neglecting text information and failing to effectively model the intra-gloss dependencies. In this work, a cross-modal learning approach that leverages text information to improve vision-based CSLR is proposed. To this end, two powerful encoding networks are first used to produce video and text embeddings, which are then mapped and aligned into a joint latent representation. The purpose of the proposed cross-modal alignment is to model intra-gloss dependencies and to create more descriptive video-based latent representations for CSLR. The proposed method is trained jointly with video and text latent representations. Finally, the aligned video latent representations are classified using a jointly trained decoder. Extensive experiments on three well-known sign language recognition datasets and comparison with state-of-the-art approaches demonstrate the great potential of the proposed approach.

Artificial Intelligence Technologies for Sign Language

Papastratis, Chatzikonstantinou, Konstantinidis et al. 2021, Sensors

AI technologies can play an important role in breaking down the communication barriers of deaf or hearing-impaired people with other communities, contributing significantly to their social inclusion. Recent advances in both sensing technologies and AI algorithms have paved the way for the development of various applications aiming at fulfilling the needs of deaf and hearing-impaired communities. To this end, this survey aims to provide a comprehensive review of state-of-the-art methods in sign language capturing, recognition, translation and representation, pinpointing their advantages and limitations. In addition, the survey presents a number of applications and discusses the main challenges in the field of sign language technologies. Future research directions are also proposed in order to assist prospective researchers in further advancing the field.

Continuous Sign Language Recognition through a Context-Aware Generative Adversarial Network

Papastratis, Dimitropoulos, Daras 2021, Sensors

Continuous sign language recognition is a weakly supervised task dealing with the identification of continuous sign gestures from video sequences, without any prior knowledge about the temporal boundaries between consecutive signs. Most of the existing methods focus mainly on the extraction of spatio-temporal visual features without exploiting text or contextual information to further improve the recognition accuracy. Moreover, the ability of deep generative models to effectively model data distribution has not yet been investigated in the field of sign language recognition. To this end, a novel approach for context-aware continuous sign language recognition using a generative adversarial network architecture, named Sign Language Recognition Generative Adversarial Network (SLRGAN), is introduced. The proposed network architecture consists of a generator that recognizes sign language glosses by extracting spatial and temporal features from video sequences, as well as a discriminator that evaluates the quality of the generator’s predictions by modeling text information at the sentence and gloss levels. The paper also investigates the importance of contextual information in sign language conversations for both Deaf-to-Deaf and Deaf-to-hearing communication. Contextual information, in the form of hidden states extracted from the previous sentence, is fed into the bidirectional long short-term memory module of the generator to improve the recognition accuracy of the network. At the final stage, sign language translation is performed by a transformer network, which converts sign language glosses to natural language text. Our proposed method achieved word error rates of 23.4%, 2.1%, and 2.26% on the RWTH-Phoenix-Weather-2014 and the Chinese Sign Language (CSL) and Greek Sign Language (GSL) Signer Independent (SI) datasets, respectively.
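
The word error rates quoted in the abstract above follow the usual definition for gloss sequences: the number of substitutions, deletions, and insertions needed to turn the predicted sequence into the reference, divided by the number of reference glosses. The sketch below illustrates that definition only. The function names and the example glosses are hypothetical, and the abstract does not say which scoring tool the authors used, so this should not be read as their evaluation code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))  # distance from an empty reference
    for i, r in enumerate(ref, 1):
        curr = [i]  # distance to an empty hypothesis
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_error_rate(references, hypotheses):
    """Corpus-level WER in percent: total edits / total reference tokens."""
    total_edits = sum(edit_distance(r, h)
                      for r, h in zip(references, hypotheses))
    total_tokens = sum(len(r) for r in references)
    return 100.0 * total_edits / total_tokens


# Hypothetical gloss sequences, used only to exercise the functions.
references = [["MORGEN", "REGEN", "NORD"], ["HEUTE", "SONNE"]]
hypotheses = [["MORGEN", "SONNE", "NORD", "WIND"], ["HEUTE", "SONNE"]]
print(word_error_rate(references, hypotheses))
```

Here the first prediction contains one substituted and one inserted gloss and the second is exact, giving 2 edits over 5 reference glosses.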

Multi-manifold Attention for Vision Transformers

Konstantinidis, Papastratis, Dimitropoulos et al. 2022, Preprint

Indoor Navigation of Quadrotors via Ultra-Wideband Wireless Technology

Papastratis, Charalambous, Παππάς 2018

Ablation study of self-supervised learning for image classification

Papastratis 2021, Preprint

The Greek Sign Language (GSL) Dataset

Papastratis, Adaloglou, Stergioulas et al. 2020
