Recent studies have demonstrated the power of recurrent neural networks for machine translation, image captioning and speech recognition. For the task of capturing temporal structure in video, however, numerous open research questions remain. Current research suggests using a simple temporal feature pooling strategy to take the temporal aspect of video into account. We demonstrate that this method is not sufficient for gesture recognition, where temporal information is more discriminative than in general video classification tasks. We explore deep architectures for gesture recognition in video and propose a new end-to-end trainable neural network architecture incorporating temporal convolutions and bidirectional recurrence. Our main contributions are twofold: first, we show that recurrence is crucial for this task; second, we show that adding temporal convolutions leads to significant improvements. We evaluate the different approaches on the Montalbano gesture recognition dataset, where we achieve state-of-the-art results.
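To make the combination concrete, the following is a minimal PyTorch sketch of the architecture family the abstract describes: temporal convolutions followed by bidirectional recurrence for framewise gesture classification. The class name, layer sizes and depth are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class TemporalConvBiRNN(nn.Module):
    """Sketch: temporal convolutions + bidirectional recurrence for
    framewise gesture classification. Dimensions are assumptions."""

    def __init__(self, feat_dim=512, hidden=256, num_classes=21):
        super().__init__()
        # 1-D convolutions over the time axis capture local motion patterns.
        self.temporal_conv = nn.Sequential(
            nn.Conv1d(feat_dim, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # Bidirectional recurrence gives each frame access to both
        # past and future context in the sequence.
        self.birnn = nn.LSTM(hidden, hidden, bidirectional=True,
                             batch_first=True)
        self.classifier = nn.Linear(2 * hidden, num_classes)

    def forward(self, x):
        # x: (batch, time, feat_dim) per-frame features
        h = self.temporal_conv(x.transpose(1, 2)).transpose(1, 2)
        h, _ = self.birnn(h)          # (batch, time, 2 * hidden)
        return self.classifier(h)     # framewise class scores
```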
Gesture and sign language recognition in a continuous video stream is a challenging task, especially with a large vocabulary. In this work, we approach it as a framewise classification problem. We tackle it using temporal convolutions and recent advances in deep learning such as residual networks, batch normalization and exponential linear units (ELUs). The models are evaluated on three different datasets.
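As a rough illustration of how those ingredients fit together, here is a hedged PyTorch sketch of a residual block over the time axis that combines temporal convolutions, batch normalization and ELUs; the channel count and kernel size are assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

class TemporalResidualBlock(nn.Module):
    """Illustrative residual block over the time axis: temporal
    convolutions + batch normalization + ELU activations."""

    def __init__(self, channels=64, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2  # keep the temporal length unchanged
        self.block = nn.Sequential(
            nn.Conv1d(channels, channels, kernel_size, padding=pad),
            nn.BatchNorm1d(channels),
            nn.ELU(),
            nn.Conv1d(channels, channels, kernel_size, padding=pad),
            nn.BatchNorm1d(channels),
        )
        self.act = nn.ELU()

    def forward(self, x):
        # x: (batch, channels, time); identity shortcut as in ResNets
        return self.act(x + self.block(x))
```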
This paper reports on a comparison of word order issues, and more specifically the order of the verb and its arguments, in two unrelated sign languages: South African Sign Language (SASL) and Flemish Sign Language (VGT). The study comprises the first part of a larger project in which a number of grammatical mechanisms and structures are compared across the two sign languages, using a corpus of comparable VGT and SASL data of a varied nature. The overall goal of the project is to contribute to a further understanding of the degree of similarity across unrelated sign languages. At the same time, the individual studies further explore the grammars of the two languages involved. In this paper, the focus is on the analysis of isolated declarative sentences elicited by means of pictures. The results yield some interesting similarities across all signers, but also indicate that there are important differences between the two languages, especially with regard to constituent order.
One way of increasing caregivers' language accessibility when interacting with a deaf child is through visual communication strategies. Using both a longitudinal and a cross-sectional approach, this study reveals which strategies deaf and hearing parents prefer and implement in their daily communication with their deaf children. First, the interactions of one deaf and two hearing mothers with their deaf children were recorded over the course of 18 months, starting when the children were 6 months of age. Second, the interactions of 5 mothers and 5 fathers (two deaf and three hearing in each group) with their deaf children (24 months old) were analysed for implicit and explicit strategy use. The analysis indicated gender-related differences and confirmed caregivers' tendency to rely on strategies closely related to the modality of their mother tongue. Finally, deaf parents outperformed hearing parents in the duration of successful interaction moments with their deaf children.
Automatic sign language recognition lies at the intersection of natural language processing (NLP) and computer vision. The highly successful transformer architectures, based on multi-head attention, originate from the field of NLP. The Video Transformer Network (VTN) is an adaptation of this concept for tasks that require video understanding, e.g., action recognition. However, due to the limited amount of labeled data that is commonly available for training automatic sign (language) recognition, the VTN cannot reach its full potential in this domain. In this work, we reduce the impact of this data limitation by automatically pre-extracting useful information from the sign language videos. In our approach, different types of information are offered to a VTN in a multi-modal setup. These include per-frame human pose keypoints (extracted with OpenPose) to capture body movement, and hand crops to capture the (evolution of) hand shapes. We evaluate our method on the recently released AUTSL dataset for isolated sign recognition and obtain 92.92% accuracy on the test set using only RGB data. For comparison, the VTN architecture without hand crops and pose flow achieves 82% accuracy. A qualitative inspection of our model hints at the further potential of multi-modal multi-head attention in a sign language recognition context.
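The following PyTorch sketch illustrates the kind of multi-modal setup the abstract describes: per-frame pose keypoints and hand-crop embeddings are projected, fused and passed through multi-head self-attention over time. The projection dimensions, fusion by concatenation and the mean-pooled readout are assumptions; 226 output classes match AUTSL's sign vocabulary.

```python
import torch
import torch.nn as nn

class MultiModalSignTransformer(nn.Module):
    """Sketch of a multi-modal transformer for isolated sign recognition.
    Fusion strategy and dimensions are assumptions, not the paper's."""

    def __init__(self, pose_dim=2 * 137, hand_dim=512, d_model=256,
                 num_heads=8, num_layers=4, num_classes=226):
        # pose_dim: e.g. 137 OpenPose keypoints x (x, y) per frame (assumed)
        # hand_dim: embedding size of a hand-crop feature extractor (assumed)
        super().__init__()
        self.pose_proj = nn.Linear(pose_dim, d_model // 2)
        self.hand_proj = nn.Linear(hand_dim, d_model // 2)
        layer = nn.TransformerEncoderLayer(d_model, num_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.classifier = nn.Linear(d_model, num_classes)

    def forward(self, pose, hands):
        # pose: (batch, time, pose_dim); hands: (batch, time, hand_dim)
        tokens = torch.cat([self.pose_proj(pose),
                            self.hand_proj(hands)], dim=-1)
        h = self.encoder(tokens)            # self-attention over frames
        return self.classifier(h.mean(dim=1))  # one label per clip
```

Concatenating the projected modalities into a single token per frame is only one plausible fusion choice; per-modality attention streams would be an equally reasonable reading of "multi-modal multi-head attention".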
Hearing parents of deaf or partially deaf infants are confronted with the complex question of how to communicate with their child. This question is complicated further by conflicting advice on how to address the child: in spoken language only, in spoken language supported by signs, or in signed language. This paper studies the linguistic environment created by one such mother (language input and parental behaviour) and her child's language production longitudinally during the first two years of the infant's life, to discover possible relationships. The mother-child dyad was observed when the child was seven, nine, twelve, eighteen, and twenty-four months old. Changes in the mother's approach to communication with her child, and their consequent effects on the child's language development, are highlighted. The infant concerned has a hearing loss of more than 90 dB in both ears, which qualified her for cochlear implantation. At the age of ten months she was implanted on her left side (30/04/2010); five months later she received a second implant (24/09/2010). By means of several assessment instruments, the linguistic environment, the infant's language development and possible causal relationships were investigated before and after implantation. These instruments include: the Pragmatics Profile of Everyday Communication; the Profile of Actual Linguistic Skills; video recordings of interaction analysed in ELAN; and the MacArthur-Bates Communicative Development Inventory for spoken Dutch and Flemish Sign Language (from nine months onwards). Results for each individual assessment moment are given, as well as an overarching interpretation of the evolution in the child's language development. The child seems to profit from a bimodal/bilingual approach to communication up to nine months of age, progressing considerably in both spoken Dutch and Flemish Sign Language, with a possible onset of functional code-switching. However, a setback in the child's language development is evidenced, mirrored in a setback in the mother's sensitive behaviour as she moves to a more monolingual approach after cochlear implantation. Highlights: This paper presents a longitudinal study of communication between a hearing mother and her deaf child, before and after cochlear implantation (CI). Benefit from the bimodal/bilingual approach is apparent up to 0;9, before CI. After CI (0;10) there seems to be a setback in the linguistic behaviour of the dyad.
The first information parents receive after referral through Universal Newborn Hearing Screening (UNHS) has significant consequences for the care-related decisions they later take, and thus for the future of the child with a hearing loss. In this study, 11 interviews were conducted with a representative sample of Flemish service providers to discover (a) the content of the information provided to parents and (b) the service providers' assumptions and beliefs concerning deafness and care. To do this, we conducted an interpretative phenomenological analysis, followed by a discourse analysis. Results showed that parents receive diverse information, depending on the reference center to which they are referred. Moreover, all service providers used a medical discourse. We suggest that there is value to be gained from closer consideration of the nature of the follow-up services provided in response to UNHS in Flanders, and from auditing the professional preparation of the service providers who are involved in informing parents.