How2Sign: A Large-scale Multimodal Dataset for Continuous American Sign Language

Duarte, Amanda; Palaskar, Shruti; Ventura, Lucas; Ghadiyaram, Deepti; DeHaan, Kenneth; Metze, Florian; Torres, Jordi; Giró-i-Nieto, Xavier

doi:10.1109/cvpr46437.2021.00276

Cited by 72 publications

(37 citation statements)

References 20 publications


“…We chose to work with a German sign language since that is the only dataset with gloss annotation that could help us study our hypotheses. The How2Sign dataset (Duarte et al, 2021) is a feasible dataset for ASL, but it does not allow any model to extract facial landmarks, facial action units or facial expression from the original video frames since the faces are blurred. In the future, we hope to see new datasets with better and more diverse annotations…”

Section: Discussion (mentioning)

confidence: 99%

Including Facial Expressions in Contextual Embeddings for Sign Language Generation

Viegas, İnan, Quandt et al. 2022

Preprint


State-of-the-art sign language generation frameworks lack expressivity and naturalness, which is the result of focusing only on manual signs and neglecting the affective, grammatical and semantic functions of facial expressions. The purpose of this work is to augment semantic representation of sign language through grounding facial expressions. We study the effect of modeling the relationship between text, gloss, and facial expressions on the performance of sign generation systems. In particular, we propose a Dual Encoder Transformer able to generate manual signs as well as facial expressions by capturing the similarities and differences found in text and sign gloss annotation. We take into consideration the role of facial muscle activity to express intensities of manual signs by being the first to employ facial action units in sign language generation. We perform a series of experiments showing that our proposed model improves the quality of automatically generated sign language.


“…The collection and annotation of sign language data is an expensive task that needs the collaboration of linguistic experts and native speakers. While there are some publicly available datasets for SLP [4,8,13,14,21,44,50], they suffer from weakly annotated data for sign language. Furthermore, most of the available datasets in SLP contain a restricted domain of the vocabularies/sentences.…”

Section: Discussion (mentioning)

confidence: 99%

“…In such dataset, a paired form of the continuous sign language sentence and the corresponding spoken language sentence needs to be included. Just a few datasets meet these criteria [11,21,44,84]. The point is that most of the aforementioned datasets cannot be used for end-to-end translation [11,44,84].…”

Section: Discussion (mentioning)

confidence: 99%

“…While there are some large-scale and annotated datasets available for sign language recognition, there are only few publicly available large-scale datasets for SLP. Two public datasets, RWTH-Phoenix-2014T [13] and How2Sign [21] are the most used datasets in sign language translation. The former includes German sign language sentences that can be used for text-to-sign language translation.…”

Section: Datasets (mentioning)

confidence: 99%


Sign Language Production: A Review

Rastgoo, Kiani, Escalera et al. 2021

Preprint


Sign Language is the dominant yet non-primary form of communication language used in the deaf and hearing-impaired community. To make an easy and mutual communication between the hearing-impaired and the hearing communities, building a robust system capable of translating the spoken language into sign language and vice versa is fundamental. To this end, sign language recognition and production are two necessary parts for making such a two-way system. Sign language recognition and production need to cope with some critical challenges. In this survey, we review recent advances in Sign Language Production (SLP) and related areas using deep learning. This survey aims to briefly summarize recent achievements in SLP, discussing their advantages, limitations, and future directions of research.


“…While there are some large-scale and annotated datasets available for sign language recognition [20], there are only a few publicly available large-scale datasets for SLP. Two public datasets, RWTH-Phoenix-2014T [44] and How2Sign [45] are the most used datasets in sign language translation. The former includes German sign language sentences that can be used for text-to-sign language translation.…”

Section: Datasets (mentioning)

confidence: 99%

All You Need In Sign Language Production

Rastgoo, Kiani, Escalera et al. 2022

Preprint


Sign Language is the dominant form of communication language used in the deaf and hearing-impaired community. To make an easy and mutual communication between the hearing-impaired and the hearing communities, building a robust system capable of translating the spoken language into sign language and vice versa is fundamental. To this end, sign language recognition and production are two necessary parts for making such a two-way system. Sign language recognition and production need to cope with some critical challenges. In this survey, we review recent advances in Sign Language Production (SLP) and related areas using deep learning. To have more realistic perspectives to sign language, we present an introduction to the Deaf culture, Deaf centers, psychological perspective of sign language, the main differences between spoken language and sign language. Furthermore, we present the fundamental components of a bi-directional sign language translation system, discussing the main challenges in this area. Also, the backbone architectures and methods in SLP are briefly introduced and the proposed taxonomy on SLP is presented. Finally, a general framework for SLP and performance evaluation, and also a discussion on the recent developments, advantages, and limitations in SLP, commenting on possible lines for future research are presented.
