Gestures are often considered demonstrative of the embodied nature of the mind (Hostetter and Alibali, 2008). In this article, we review current theories and research on the intra-cognitive role of gestures, asking: how can gestures support the internal cognitive processes of the gesturer? We suggest that extant theories are in a sense disembodied, because they focus solely on embodiment in terms of the sensorimotor neural precursors of gestures. As a result, current theories of the intra-cognitive role of gestures lack the explanatory scope to address how gestures-as-bodily-acts fulfill a cognitive function. On the basis of recent theoretical appeals to the possibly embedded/extended cognitive role of gestures (Clark, 2013), we suggest that gestures are external physical tools of the cognitive system that replace and support otherwise solely internal cognitive processes. That is, gestures provide the cognitive system with a stable external physical and visual presence that offers a means to think with. We show that there is considerable overlap between the way the human cognitive system has been found to use its environment and the way gestures are used during cognitive processes. Lastly, we offer several suggestions for how to investigate the embedded/extended perspective on the cognitive function of gestures.
Open data and pre-registration: This study was pre-registered on the Open Science Framework (OSF: https://osf.io/5aydk/), and the raw anonymized quantitative data and analysis scripts supporting this confirmatory study are also available on the OSF. Note that parts of the current manuscript may overlap verbatim with the pre-registration.
Gesture–speech synchrony re-stabilizes when hand movement or speech is disrupted by a delayed-feedback manipulation, suggesting strong bidirectional coupling between gesture and speech. Yet it has also been argued, from case studies in perceptual–motor pathology, that hand gestures are a special kind of action that does not require closed-loop re-afferent feedback to maintain synchrony with speech. In the current pre-registered within-subject study, we used motion tracking to conceptually replicate McNeill's (1992) classic study on gesture–speech synchrony under normal versus 150-ms delayed auditory feedback of speech (NO DAF vs. DAF). Consistent with, and extending, McNeill's original results, we obtain evidence that (a) gesture–speech synchrony is more stable under DAF than under NO DAF (an increased coupling effect), (b) gesture and speech variably entrain to the external auditory delay, as indicated by a consistent shift in gesture–speech synchrony offsets (an entrainment effect), and (c) the coupling effect and the entrainment effect are co-dependent. We suggest, therefore, that gesture–speech synchrony provides a way for the cognitive system to stabilize rhythmic activity under interfering conditions.
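The synchrony offsets referred to above are, in essence, lags between gesture kinematics and speech acoustics. As a minimal sketch (not the study's actual pipeline), one could estimate such an offset as the lag maximizing the cross-correlation between a gesture speed profile and a speech amplitude envelope; the function name, sampling rate, and toy signals below are illustrative assumptions.

```python
import numpy as np
from scipy.signal import correlate, correlation_lags

def synchrony_offset_ms(gesture_speed, speech_envelope, fs_hz=250):
    """Lag (ms) at which gesture speed best aligns with the speech envelope.

    Negative values mean gesture leads speech. Both signals are assumed to
    be 1-D and sampled at the same rate fs_hz.
    """
    g = (gesture_speed - gesture_speed.mean()) / gesture_speed.std()
    s = (speech_envelope - speech_envelope.mean()) / speech_envelope.std()
    xcorr = correlate(g, s, mode="full")
    lags = correlation_lags(len(g), len(s), mode="full")
    return 1000.0 * lags[np.argmax(xcorr)] / fs_hz

# Toy usage: a speech envelope trailing the gesture peak by 10 samples.
fs = 250                                    # assumed sampling rate (Hz)
t = np.arange(0, 4, 1 / fs)
gesture = np.exp(-((t - 2.0) ** 2) / 0.02)  # toy gesture speed peak at 2 s
speech = np.roll(gesture, 10)               # envelope lags by 10 samples
print(synchrony_offset_ms(gesture, speech, fs))  # ~ -40 ms: gesture leads
```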
It is commonly understood that hand gesture and speech coordination in humans is culturally and cognitively acquired, rather than having a biological basis. Recently, however, the biomechanical coupling of arm movements to speech vocalization has been studied in steady-state vocalization and monosyllabic utterances, where forces produced during gesturing are transferred onto the tensioned body, leading to changes in respiration-related activity and thereby affecting vocalization F0 and intensity. In the current experiment (n = 37), we extend this previous line of work to show that gesture–speech physics also impacts fluent speech. Compared with no movement, participants producing fluent self-formulated speech while rhythmically moving their limbs show a heightened F0 and amplitude envelope, and these effects are more pronounced for higher-impulse arm movements than for lower-impulse wrist movements. We replicate the finding that acoustic peaks arise especially during moments of peak impulse (i.e., the beat) of the movement, namely around its deceleration phases. Finally, higher deceleration rates of higher-mass arm movements were related to higher peaks in the acoustics. These results confirm a role for the physical impulses of gesture in affecting the speech system. We discuss the implications of gesture–speech physics for understanding the emergence of communicative gesture, both ontogenetically and phylogenetically.
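The analysis described above requires locating moments of peak deceleration in the movement and peaks in the speech amplitude envelope. Below is a minimal sketch of how both could be extracted, assuming a 3-D position trace and a mono audio waveform; the helper names, sampling rates, and filter settings are assumptions, not the authors' pipeline.

```python
import numpy as np
from scipy.signal import hilbert, find_peaks, butter, filtfilt

def deceleration_peak_times(pos, fs_mo=250):
    """Times (s) of peak deceleration for a (T, 3) position trace."""
    speed = np.linalg.norm(np.gradient(pos, 1.0 / fs_mo, axis=0), axis=1)
    accel = np.gradient(speed, 1.0 / fs_mo)
    peaks, _ = find_peaks(-accel)  # deceleration = most negative acceleration
    return peaks / fs_mo

def amplitude_envelope(audio, fs_au=16000, cutoff_hz=12):
    """Smoothed Hilbert amplitude envelope of a mono speech signal."""
    env = np.abs(hilbert(audio))
    b, a = butter(2, cutoff_hz / (fs_au / 2))  # low-pass smoothing filter
    return filtfilt(b, a, env)

# One could then sample the envelope at each deceleration peak time and
# compare those values against envelope values at random control times.
```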
There is increasing evidence that hand gestures and speech synchronize their activity on multiple dimensions and timescales. For example, gestures' kinematic peaks (e.g., maximum speed) are coupled with prosodic markers in speech. Such coupling operates on very short timescales at the level of syllables (200 ms) and therefore requires high-resolution measurement of gesture kinematics and speech acoustics. High-resolution speech analysis is common in gesture studies, given the field's classic ties with (psycho)linguistics. However, the field has lagged behind in the objective study of gesture kinematics (e.g., as compared with research on instrumental action). Kinematic peaks in gesture are often measured by eye, with a "moment of maximum effort" determined by several raters. In the present article, we provide a tutorial on more efficient methods for quantifying the temporal properties of gesture kinematics, focusing on common challenges, and possible solutions, that come with the complexities of studying multimodal language. We further introduce and compare, using an actual gesture dataset (392 gesture events), the performance of two video-based motion-tracking methods (deep learning vs. pixel change) against a high-performance wired motion-tracking system (Polhemus Liberty). We show that the videography methods perform well in the temporal estimation of kinematic peaks and thus provide a cheap alternative to expensive motion-tracking systems. We hope that the present article encourages gesture researchers to embark on the widespread objective study of gesture kinematics and their relation to speech.

Keywords: motion tracking; video recording; deep learning; gesture and speech analysis; multimodal language

There is increasing interest in the ways that co-occurring hand gestures and speech coordinate their activity (Esteve-Gibert & Guellaï, 2018; Wagner, Malisz, & Kopp, 2014). The nature of this coordination is exquisitely complex, as it operates on multiple levels and timescales. For example, on the semantic level, referential
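To make the pixel-change method compared in the tutorial abstract above concrete, here is a minimal sketch that derives a per-frame motion-energy series from video via grayscale frame differencing, assuming OpenCV is available; the file name and the use of the global maximum as a candidate kinematic peak are illustrative assumptions.

```python
import cv2
import numpy as np

def pixel_change_series(video_path):
    """Per-frame summed absolute grayscale pixel change across a video."""
    cap = cv2.VideoCapture(video_path)
    ok, frame = cap.read()
    prev = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    series = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        series.append(np.sum(cv2.absdiff(gray, prev)))  # motion energy
        prev = gray
    cap.release()
    return np.asarray(series, dtype=float)

motion = pixel_change_series("gesture_clip.mp4")  # hypothetical clip
peak_frame = int(np.argmax(motion))               # candidate kinematic peak
```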