The automatic recognition of human gestures is a complex multidisciplinary problem that has not yet been completely solved. Since the advent of digital video capture technologies, there have been attempts to recognize dynamic gestures for different purposes. In recent years, new technologies such as depth sensors and high-resolution cameras have been incorporated, and the high processing capacity of current devices has made it possible to develop systems capable of detecting different movements and acting in real time. Unlike speech recognition, which has been researched for more than forty years, the topic of this thesis is relatively new in the scientific community, and it evolves rapidly as new devices and new computer vision algorithms appear.

Many different tasks must be tackled before an automatic sign language recognition system can translate an interpreter's gestures. First, there are different approaches depending on the sensing device used. Once the gesture is captured, several pre-processing stages are required to identify regions of interest, such as the interpreter's hands and face, and then to extract the trajectories of the performed gesture.

Sign language presents huge variability in the postures or configurations that a hand can adopt, which makes this discipline a particularly complex problem. Dealing with this variability requires the correct generation of static and dynamic descriptors. In addition, because each region has its own language and grammar, an Argentinian Sign Language (LSA) database is required, and none was available until now. For these reasons, this thesis aims to develop a complete process of interpretation and translation of Argentinian Sign Language from videos obtained with an RGB camera.

First, a study of the state of the art in gesture recognition was carried out.
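To make the trajectory step concrete, the following is a minimal sketch (not the thesis implementation; the function name, parameters, and normalization choices are illustrative) of turning a variable-length sequence of per-frame hand centroids into a fixed-length, translation- and scale-normalized dynamic descriptor suitable for a classic classifier:

```python
import numpy as np

def trajectory_descriptor(centroids, n_points=8):
    """Resample a variable-length 2-D hand trajectory into a fixed-length,
    translation- and scale-normalized descriptor (illustrative sketch)."""
    pts = np.asarray(centroids, dtype=float)          # shape (T, 2)
    # Translation invariance: move the trajectory's centroid to the origin.
    pts -= pts.mean(axis=0)
    # Scale invariance: divide by the largest distance from the origin.
    scale = np.max(np.linalg.norm(pts, axis=1))
    if scale > 0:
        pts /= scale
    # Resample to n_points via linear interpolation on a common parameter.
    t_old = np.linspace(0.0, 1.0, len(pts))
    t_new = np.linspace(0.0, 1.0, n_points)
    resampled = np.column_stack(
        [np.interp(t_new, t_old, pts[:, d]) for d in range(2)]
    )
    return resampled.ravel()                          # shape (2 * n_points,)

# A diagonal hand sweep sampled at 5 frames:
desc = trajectory_descriptor([(0, 0), (2, 1), (4, 2), (6, 3), (8, 4)])
print(desc.shape)  # (16,)
```

Because the centroid is subtracted, the same gesture performed anywhere in the frame yields the same descriptor, which is the kind of invariance the pre-processing stages aim for.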
Intelligent techniques for image and video processing, as well as the different types of descriptors, were researched. As preliminary work, a strategy capable of processing human actions captured with an MS Kinect device [4] was developed. This strategy implements a probabilistic SOM neural network (ProbSOM) together with a descriptor specifically designed to retain temporal information. This work improved on the best results reported so far for two well-known databases.

As a result of this thesis, two main contributions were made to the sign language field. In the first place, a specific database for the recognition of Argentinian Sign Language was developed. This included an image database with the 16 hand configurations most used in the language [3], along with a database of high-resolution videos of 64 different signs, totaling 3200 videos [2]. These databases were recorded with 10 different interpreters and several repetitions, allowing their use with classic machine learning techniques. In addition, in these databases the interpreters wore colored gloves that act as markers. Thi...
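For readers unfamiliar with the model family behind ProbSOM, the following is a minimal self-organizing map (SOM) sketch. It is an assumption-laden illustration of the general technique, not the thesis's ProbSOM: all function names, the grid size, and the learning schedule are invented for the example.

```python
import numpy as np

def train_som(data, grid=(4, 4), epochs=50, lr=0.5, sigma=1.5, seed=0):
    """Train a tiny self-organizing map: neurons on a 2-D grid compete for
    each sample, and the winner and its grid neighbors move toward it."""
    rng = np.random.default_rng(seed)
    h, w = grid
    weights = rng.normal(size=(h * w, data.shape[1]))
    # Grid coordinates of each neuron, used by the neighborhood function.
    coords = np.array([(i, j) for i in range(h) for j in range(w)], float)
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs          # shrink lr and sigma over time
        for x in data:
            # Best-matching unit: neuron whose weights are closest to x.
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
            # Gaussian neighborhood around the BMU on the grid.
            d2 = np.sum((coords - coords[bmu]) ** 2, axis=1)
            nb = np.exp(-d2 / (2 * (sigma * decay + 1e-9) ** 2))
            weights += (lr * decay) * nb[:, None] * (x - weights)
    return weights

def classify(weights, labels, x):
    """Label a sample with the label assigned to its best-matching unit."""
    return labels[np.argmin(np.linalg.norm(weights - x, axis=1))]
```

After training, each neuron can be labeled from the training data, and a new descriptor is classified by its best-matching unit; a probabilistic variant such as ProbSOM replaces this hard winner-take-all assignment with a probabilistic one.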