In everyday life, people interact with each other not only through verbal communication but also through spontaneous beat gestures, which form an important part of the paralinguistic context of face-to-face conversations. Nonetheless, their role and neural correlates have seldom been addressed. Here we investigate the time course of beat-speech integration under natural speech perception conditions. We measured event-related potentials (ERPs) to words pronounced with or without an accompanying beat gesture while participants attended to a political speech. When the speaker was in sight, words pronounced with a beat gesture elicited a positive shift in ERPs at an early sensory stage (before 100 ms) and in a later time window coinciding with the auditory component P2. This result held partially even when the auditory signal was removed from the audiovisual signal. Interestingly, there was no difference between words pronounced with and without gestures when participants listened to the same speech passage without viewing the speaker. We conclude that in a naturalistic speech context, beat gestures are integrated with speech early in time and modulate the sensory/phonological levels of processing. We propose that these results point to a role of beats as highlighters, helping direct the listener's attention to important information, rather than adding information per se. Beat gestures would thus modulate how verbal information is processed.
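The within-subject ERP contrast described above can be outlined with standard epoching-and-averaging tools. Below is a minimal sketch using MNE-Python; the file name, annotation labels ("beat" / "no_beat"), and epoch window are illustrative assumptions, not the authors' actual pipeline.

```python
import mne

# Hypothetical preprocessed recording with word onsets annotated as
# "beat" (word accompanied by a gesture) or "no_beat".
raw = mne.io.read_raw_fif("speech_task-raw.fif", preload=True)
events, event_id = mne.events_from_annotations(raw)

epochs = mne.Epochs(raw, events, event_id=event_id,
                    tmin=-0.2, tmax=0.6,              # window around word onset
                    baseline=(-0.2, 0.0), preload=True)

# Average each condition and take the beat-minus-no-beat difference wave,
# where the early (<100 ms) and P2-range positive shifts would appear.
evoked_beat = epochs["beat"].average()
evoked_no_beat = epochs["no_beat"].average()
difference = mne.combine_evoked([evoked_beat, evoked_no_beat], weights=[1, -1])
```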
During public addresses, speakers accompany their discourse with spontaneous hand gestures (beats) that are tightly synchronized with the prosodic contour of the discourse. It has been proposed that speech and beat gestures originate from a common underlying linguistic process whereby both speech prosody and beats serve to emphasize relevant information. We hypothesized that breaking the consistency between beats and prosody through temporal desynchronization would modulate activity in brain areas sensitive to speech-gesture integration. To this end, we measured BOLD responses as participants watched a natural discourse in which the speaker used beat gestures. In order to identify brain areas specifically involved in processing hand gestures with communicative intention, beat synchrony was evaluated against arbitrary visual cues with rhythmic and spatial properties equivalent to those of the gestures. Our results revealed that the left middle temporal gyrus (MTG) and inferior frontal gyrus (IFG) were specifically sensitive to speech synchronized with beats, compared with the arbitrary vision-speech pairing. These results suggest that listeners ascribe to beats a function of visual prosody, complementary to the prosodic structure of speech. We conclude that the emphasizing function of beat gestures in speech perception is instantiated through a specialized brain network sensitive to the communicative intent conveyed by a speaker with his or her hands.
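The beat-versus-arbitrary-cue comparison amounts to a first-level GLM contrast. A sketch of that analysis with nilearn follows; the event table, regressor names ("beat_sync", "cue_sync"), TR, and file path are hypothetical placeholders, not the study's actual design.

```python
import pandas as pd
from nilearn.glm.first_level import FirstLevelModel

# Hypothetical event table: onsets/durations (s) of speech synchronized
# with beats vs. with arbitrary rhythmic visual cues (BIDS-style columns).
events = pd.DataFrame({
    "onset":      [12.0, 45.5, 80.2, 113.7],
    "duration":   [4.0, 4.0, 4.0, 4.0],
    "trial_type": ["beat_sync", "cue_sync", "beat_sync", "cue_sync"],
})

model = FirstLevelModel(t_r=2.0, noise_model="ar1", hrf_model="spm")
model = model.fit("sub-01_task-discourse_bold.nii.gz", events=events)

# Contrast isolating beat-specific activity over the arbitrary-cue control.
zmap = model.compute_contrast("beat_sync - cue_sync", output_type="z_score")
```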
Speakers often accompany speech with spontaneous beat gestures in natural spoken communication. These gestures are usually aligned with lexical stress and can modulate the saliency of their affiliate words. Here we addressed the consequences of beat gestures for the neural correlates of speech perception. Previous studies have highlighted the role played by theta oscillations in the temporal prediction of speech. We hypothesized that the sight of beat gestures may influence ongoing low-frequency neural oscillations around the onset of the corresponding words. Electroencephalographic (EEG) recordings were acquired while participants watched a continuous, naturally recorded discourse. The phase-locking value (PLV) at word onset was calculated from the EEG for pairs of identical words that had been pronounced with and without a concurrent beat gesture in the discourse. We observed an increase in PLV in the 5-6 Hz theta range, as well as a desynchronization in the 8-10 Hz alpha band, around the onset of words preceded by a beat gesture. These findings suggest that beats help tune low-frequency oscillatory activity at relevant moments during natural speech perception, providing new insight into how speech and paralinguistic information are integrated.
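For reference, the PLV has a compact definition: given N trials with instantaneous phase φ_n(t) in the band of interest, PLV(t) = |(1/N) Σ_n exp(iφ_n(t))|, ranging from 0 (random phases) to 1 (perfect phase alignment across trials). Below is a minimal sketch of this computation, assuming word-onset-locked epochs are already available as a NumPy array; the sampling rate, band limits, and filter settings are illustrative, not the study's exact parameters.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def plv(trials, fs, band=(5.0, 6.0)):
    """Phase-locking value across trials for one channel.

    trials: array of shape (n_trials, n_samples), epochs time-locked
            to word onset; fs: sampling rate in Hz; band: theta range.
    Returns PLV as a function of time, shape (n_samples,).
    """
    sos = butter(4, band, btype="bandpass", fs=fs, output="sos")
    filtered = sosfiltfilt(sos, trials, axis=-1)
    phase = np.angle(hilbert(filtered, axis=-1))        # instantaneous phase
    return np.abs(np.mean(np.exp(1j * phase), axis=0))  # |mean phase vector|

# Usage: compare PLV for beat vs. no-beat epochs (hypothetical arrays).
# plv_beat, plv_nobeat = plv(epochs_beat, 500.0), plv(epochs_nobeat, 500.0)
```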
The interactions between the senses are essential for cognitive functions such as perception, attention, and action planning. Past research has advanced our understanding of multisensory processes in the laboratory. Yet efforts to extrapolate these findings to the real world remain scarce. Extrapolation to real-world contexts matters for both practical and theoretical reasons: multisensory phenomena may be expressed differently in real-world settings than in simpler laboratory situations. Some effects might become stronger, others may disappear, and new outcomes could be discovered. This Element discusses research that uncovers multisensory interactions in complex environments, with an emphasis on the interplay of multisensory mechanisms with other processes.
During natural speech perception, listeners rely on a wide range of cues to support comprehension, from semantic context to prosodic information. There is general consensus that prosody plays a role in syntactic parsing, but most studies focusing on ambiguous relative clauses (RCs) show that prosodic cues alone are insufficient to reverse the preferred interpretation of a sentence. These findings suggest that universally preferred structures (e.g., the Late Closure principle) matter far more than prosodic cues in such cases. This study explores an alternative hypothesis: that the weak effect of prosody might be due to the influence of various syntactic, lexical-semantic, and acoustic confounds, and it investigates the consequences of prosodic breaks while controlling for these variables. We used Spanish RC sentences in three experimental conditions in which the presence and position (following the first or second noun phrase) of prosodic breaks was manipulated. The results showed that the placement of a prosodic break determined sentence interpretation by changing the preferred attachment of the RC. Listeners' natural preference for low attachment (in the absence of a break) was reinforced when a prosodic break was placed after the first noun. In contrast, a prosodic break placed after the second noun reversed the preferred interpretation of the sentence toward high attachment. We argue that, in addition to other factors, listeners indeed use prosodic breaks as robust cues to syntactic parsing during speech processing, as these cues may direct listeners toward one interpretation or another.
During multisensory speech perception, slow δ oscillations (~1-3 Hz) in the listener's brain synchronize with the speech signal, likely engaging in speech signal decomposition. Notable fluctuations in the speech amplitude envelope, reflecting speaker prosody, temporally align with articulatory and body gestures, and both provide complementary cues that temporally structure speech. Further, δ oscillations in the left motor cortex seem to align with speech and musical beats, suggesting a possible role in the temporal structuring of (quasi-)rhythmic stimulation. We extended the role of δ oscillations to audiovisual asynchrony detection as a test case of the temporal analysis of multisensory prosodic fluctuations in speech. We recorded electroencephalography (EEG) responses in an audiovisual asynchrony detection task while participants watched videos of a speaker. We filtered the speech signal to remove verbal content and examined how visual and auditory prosodic features temporally (mis-)align. Results showed that (1) participants accurately detected audiovisual asynchrony; (2) δ power in the left motor cortex increased in response to audiovisual asynchrony, and the difference in δ power between asynchronous and synchronous conditions predicted behavioral performance; and (3) δ-β coupling in the left motor cortex decreased when listeners could not accurately map visual and auditory prosodies. Finally, both behavioral and neurophysiological effects were altered when the speaker's face was degraded by a visual mask. Together, these findings suggest that motor δ oscillations support the detection of asynchrony between multisensory prosodic fluctuations in speech.
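Both reported measures are standard quantities. A minimal sketch of each follows, assuming a single-channel motor-cortex time series as a NumPy array: band power from Welch's method for the δ range, and a mean-vector-length estimate of δ-β phase-amplitude coupling (the band limits, especially for β, are illustrative rather than the study's exact settings).

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert, welch

def bandpass(x, fs, band):
    sos = butter(4, band, btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, x)

def band_power(x, fs, band=(1.0, 3.0)):
    """Mean Welch spectral power of a 1-D signal within a frequency band."""
    f, pxx = welch(x, fs=fs, nperseg=int(4 * fs))
    mask = (f >= band[0]) & (f <= band[1])
    return pxx[mask].mean()

def delta_beta_coupling(x, fs, delta=(1.0, 3.0), beta=(15.0, 25.0)):
    """Mean-vector-length estimate of delta-beta phase-amplitude coupling:
    |mean(A_beta(t) * exp(i * phi_delta(t)))| over time."""
    phase = np.angle(hilbert(bandpass(x, fs, delta)))  # delta phase
    amp = np.abs(hilbert(bandpass(x, fs, beta)))       # beta envelope
    return np.abs(np.mean(amp * np.exp(1j * phase)))

# Usage: compare delta power across conditions (hypothetical signals).
# dp_async, dp_sync = band_power(x_async, fs), band_power(x_sync, fs)
```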