Viewing a speaker's articulatory movements substantially improves a listener's ability to understand spoken words, especially under noisy environmental conditions. It has been claimed that this gain is most pronounced when auditory input is weakest, an effect that has been related to a well-known principle of multisensory integration--"inverse effectiveness." In keeping with the predictions of this principle, the present study showed substantial gain in multisensory speech enhancement at even the lowest signal-to-noise ratios (SNRs) used (-24 dB), but it was also evident that there was a "special zone" at a more intermediate SNR of -12 dB where multisensory integration was additionally enhanced beyond the predictions of this principle. As such, we show that inverse effectiveness does not strictly apply to the multisensory enhancements seen during audiovisual speech perception. Rather, the gain from viewing visual articulations is maximal at intermediate SNRs, well above the lowest auditory SNR where the recognition of whole words is significantly different from zero. We contend that the multisensory speech system is maximally tuned for SNRs between extremes, where the system relies on either the visual (speech-reading) or the auditory modality alone, forming a window of maximal integration at intermediate SNR levels. At these intermediate levels, the extent of multisensory enhancement of speech recognition is considerable, amounting to more than a 3-fold performance improvement relative to an auditory-alone condition.
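For readers unfamiliar with how such enhancement is typically quantified, the sketch below computes two common gain metrics (the AV minus A accuracy difference and the AV/A ratio) across SNRs. This is an illustrative Python example only; the accuracy values are placeholder assumptions, not data or analysis code from the study.

```python
# Minimal sketch (not the study's analysis): quantifying multisensory gain
# from word-recognition accuracies at several SNRs. All values are
# illustrative placeholders.

snrs_db = [-24, -20, -16, -12, -8, -4, 0]                        # auditory SNRs
acc_auditory = [0.02, 0.05, 0.12, 0.25, 0.55, 0.80, 0.92]        # hypothetical A-only accuracy
acc_audiovisual = [0.04, 0.12, 0.35, 0.80, 0.90, 0.95, 0.97]     # hypothetical AV accuracy

for snr, a, av in zip(snrs_db, acc_auditory, acc_audiovisual):
    absolute_gain = av - a    # difference in proportion of words correctly recognized
    relative_gain = av / a    # improvement relative to the auditory-alone condition
    print(f"SNR {snr:>4} dB: AV-A = {absolute_gain:.2f}, AV/A = {relative_gain:.1f}x")

# With these illustrative numbers, both metrics peak at the intermediate SNR of
# -12 dB (the pattern described above), whereas strict inverse effectiveness
# would predict the largest gain at the lowest SNRs.
```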
Memory for people and their relationships, along with memory for social language and social behaviors, constitutes a specific type of semantic memory termed social knowledge. This review focuses on how and where social knowledge is represented in the brain. We propose that portions of the anterior temporal lobe (ATL) play a critical role in representing and retrieving social knowledge. This includes memory about people, their names, and their biographies, as well as more abstract forms of social memory such as memory for traits and social concepts. This hypothesis is based on the convergence of several lines of research, including anatomical findings, lesion evidence from both humans and non-human primates, and neuroimaging evidence. Moreover, the ATL is closely interconnected with cortical nuclei of the amygdala and orbitofrontal cortex via the uncinate fasciculus. We propose that this pattern of connectivity underlies the function of the ATL in encoding and storing emotionally tagged knowledge that is used to guide orbitofrontal-based decision processes.
Two distinct literatures have emerged on the functionality of the anterior temporal lobes (ATL): in one field, the ATLs are conceived of as a repository for semantic or conceptual knowledge. In another field, the ATLs are thought to play some undetermined role in social-emotional functions such as Theory of Mind. Here we attempted to reconcile these distinct functions by assessing whether social semantic processing can explain ATL activation in other social cognitive tasks. Social semantic functions refer to knowledge about social concepts and rules. In a first experiment we tested the idea that social semantic representations can account for activations in the ATL to social attribution stimuli such as Heider and Simmel animations. Left ATL activations to Heider and Simmel stimuli overlapped with activations to social words. In a second experiment we assessed the putative roles of the ATLs in the processing of narratives and theory of mind content and found evidence for a role of the ATLs in the processing of theory of mind but not narrative per se. These findings indicate that the ATLs are part of a neuronal network supporting social cognition and that they are engaged when tasks demand access to social conceptual knowledge.
Under noisy listening conditions, visualizing a speaker's articulations substantially improves speech intelligibility. This multisensory speech integration ability is crucial to effective communication, and its appropriate development greatly impacts a child's ability to successfully navigate educational and social settings. Research shows that multisensory integration abilities continue developing late into childhood. The primary aim here was to track the development of these abilities in children with autism, since multisensory deficits are increasingly recognized as a component of the autism spectrum disorder (ASD) phenotype. The abilities of high-functioning children with ASD (n = 84) to integrate seen and heard speech were assessed cross-sectionally while environmental noise levels were systematically manipulated, and their performance was compared with that of age-matched neurotypical children (n = 142). Severe integration deficits were uncovered in ASD, and these became increasingly pronounced as background noise increased. The deficits were evident in school-aged children with ASD (5-12 years old) but were fully ameliorated in children with ASD entering adolescence (13-15 years old). The severity of the multisensory deficits uncovered has important implications for educators and clinicians working in ASD. We take the observation that the multisensory speech system recovers substantially in adolescence as an indication that it is likely amenable to intervention during earlier childhood, with potentially profound implications for the development of social communication abilities in children with ASD.
Watching a speaker's facial movements can dramatically enhance our ability to comprehend words, especially in noisy environments. From a general doctrine of combining information from different sensory modalities (the principle of inverse effectiveness), one would expect that the visual signals would be most effective at the highest levels of auditory noise. In contrast, we find, in accord with a recent paper, that visual information improves performance more at intermediate levels of auditory noise than at the highest levels, and we show that a novel visual stimulus containing only temporal information does the same. We present a Bayesian model of optimal cue integration that can explain these conflicts. In this model, words are regarded as points in a multidimensional space and word recognition is a probabilistic inference process. When the dimensionality of the feature space is low, the Bayesian model predicts inverse effectiveness; when the dimensionality is high, the enhancement is maximal at intermediate auditory noise levels. When the auditory and visual stimuli differ slightly in high noise, the model makes a counterintuitive prediction: as sound quality increases, the proportion of reported words corresponding to the visual stimulus should first increase and then decrease. We confirm this prediction in a behavioral experiment. We conclude that auditory-visual speech perception obeys the same notion of optimality previously observed only for simple multisensory stimuli.
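To make the kind of model described above concrete, here is a deliberately simplified simulation: candidate words are points in a D-dimensional feature space, the auditory and visual cues are independent Gaussian-noise observations of the spoken word, and recognition selects the maximum a posteriori word. The lexicon size, dimensionality, and noise levels are arbitrary assumptions for illustration, not parameters of the published model.

```python
# Toy sketch of Bayesian audiovisual word recognition (assumptions only, not
# the authors' model): flat prior over words, Gaussian likelihoods for each cue.

import numpy as np

rng = np.random.default_rng(0)

def simulate_accuracy(n_words=50, dim=30, sigma_a=1.0, sigma_v=2.0,
                      n_trials=2000, use_visual=True):
    """Proportion of trials on which the maximum a posteriori word is correct."""
    words = rng.normal(size=(n_words, dim))          # the lexicon: one point per word
    correct = 0
    for _ in range(n_trials):
        true_idx = rng.integers(n_words)
        x_a = words[true_idx] + sigma_a * rng.normal(size=dim)       # noisy auditory cue
        # Log-posterior over candidate words given the auditory cue (flat prior)
        log_post = -np.sum((words - x_a) ** 2, axis=1) / (2 * sigma_a ** 2)
        if use_visual:
            x_v = words[true_idx] + sigma_v * rng.normal(size=dim)   # noisy visual cue
            log_post += -np.sum((words - x_v) ** 2, axis=1) / (2 * sigma_v ** 2)
        correct += int(np.argmax(log_post) == true_idx)
    return correct / n_trials

# Larger sigma_a stands in for lower auditory SNR; compare auditory-only vs audiovisual.
for sigma_a in (0.5, 1.0, 2.0, 4.0, 8.0):
    a_only = simulate_accuracy(sigma_a=sigma_a, use_visual=False)
    audiovisual = simulate_accuracy(sigma_a=sigma_a, use_visual=True)
    print(f"sigma_a={sigma_a:>4}: A={a_only:.2f}  AV={audiovisual:.2f}  gain={audiovisual - a_only:.2f}")
```

Sweeping the auditory noise level while holding the visual noise fixed lets one examine where the audiovisual gain is largest, and varying the dimensionality of the feature space lets one probe the low- versus high-dimensional regimes contrasted in the abstract.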
Observing a speaker’s articulations substantially improves the intelligibility of speech, especially under noisy listening conditions. This multisensory integration of speech inputs is crucial to effective communication. Appropriate development of this ability has major implications for children in classroom and social settings, and deficits in it have been linked to a number of neurodevelopmental disorders, especially autism. Structural imaging studies make clear that regions of the perisylvian cortex, which are firmly established as crucial to speech-language functions, follow a prolonged maturational course that persists into late childhood. Given this protracted maturational timeframe, we reasoned that multisensory speech processing might well show a similarly protracted developmental course. Previous work in adults has shown that audiovisual enhancement in word recognition is most apparent within a restricted range of signal-to-noise ratios (SNRs). Here we asked when these properties emerge during childhood by testing multisensory speech recognition abilities in typically developing children aged between 5 and 14 years and comparing them with adults. By parametrically varying SNRs, we found that children benefited significantly less from observing visual articulations, showing considerably less audiovisual enhancement than adults. The findings suggest that improvement in the ability to recognize speech in noise, and in audiovisual integration during speech perception, continues quite late into the childhood years. The implication is that a considerable amount of multisensory learning remains to be achieved during the later schooling years, and that explicit efforts to accommodate this learning may well be warranted.
The neural processing of biological motion (BM) is of profound experimental interest since it is often through the movement of another that we interpret their immediate intentions. Neuroimaging points to a specialized cortical network for processing biological motion. Here, high-density electrical mapping and source-analysis techniques were employed to interrogate the timing of information processing across this network. Participants viewed point-light displays depicting standard body movements (e.g. jumping), while event-related potentials (ERPs) were recorded and compared with ERPs elicited by scrambled-motion control stimuli. In a pair of experiments, three major phases of BM-specific processing were identified: 1) The earliest phase of BM-sensitive modulation was characterized by a positive shift of the ERP between 100 and 200 ms after stimulus onset. This modulation was observed exclusively over the right hemisphere, and source analysis suggested a likely generator in close proximity to regions associated with general motion processing (KO/hMT). 2) The second phase of BM sensitivity occurred from 200 to 350 ms and was characterized by a robust negative-going ERP modulation over posterior middle temporal regions bilaterally. Source analysis pointed to bilateral generators at or near the posterior superior temporal sulcus (STS). 3) A third phase of processing was evident only in our second experiment, in which participants actively attended the BM aspect of the stimuli, and was manifest as a centro-parietal positive ERP deflection, likely related to later cognitive processes. These results point to very early sensory registration of biological motion and highlight the interactive role of the posterior STS in analyzing the movements of other living organisms.
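As a rough illustration of the window-based comparison described above (and emphatically not the study's actual analysis pipeline), the following sketch averages ERP amplitude within the two reported early time windows and takes the biological-motion minus scrambled-motion difference. The sampling rate, epoch length, and synthetic data are placeholder assumptions.

```python
# Minimal sketch with synthetic data: mean-amplitude comparison of ERPs to
# biological-motion (BM) vs scrambled-motion stimuli in two time windows.

import numpy as np

fs = 512                                  # assumed sampling rate (Hz)
times = np.arange(-0.1, 0.6, 1 / fs)      # epoch from -100 to 600 ms

# Placeholder single-channel trial data (trials x samples), averaged to an ERP
rng = np.random.default_rng(1)
erp_bm = rng.normal(0, 1, size=(100, times.size)).mean(axis=0)
erp_scrambled = rng.normal(0, 1, size=(100, times.size)).mean(axis=0)

def window_mean(erp, t0, t1):
    """Mean amplitude of an averaged ERP within [t0, t1] seconds."""
    mask = (times >= t0) & (times <= t1)
    return erp[mask].mean()

for label, (t0, t1) in {"phase 1": (0.10, 0.20), "phase 2": (0.20, 0.35)}.items():
    diff = window_mean(erp_bm, t0, t1) - window_mean(erp_scrambled, t0, t1)
    print(f"{label}: BM - scrambled mean amplitude = {diff:.3f} (arbitrary units)")
```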