A compelling example of auditory-visual multisensory integration is the McGurk effect, in which an auditory syllable is perceived very differently depending on whether it is accompanied by a visual movie of a speaker pronouncing the same syllable or a different, incongruent syllable. Anatomical and physiological studies in human and nonhuman primates have suggested that the superior temporal sulcus (STS) is involved in auditory-visual integration for both speech and nonspeech stimuli. We hypothesized that the STS plays a critical role in the creation of the McGurk percept. Because the location of multisensory integration in the STS varies from subject to subject, the location of auditory-visual speech processing in the STS was first identified in each subject with fMRI. Then, activity in this region of the STS was disrupted with single-pulse transcranial magnetic stimulation (
The McGurk effect is a compelling illusion in which humans perceive mismatched audiovisual speech as a completely different syllable. However, some normal individuals do not experience the illusion, reporting that the stimulus sounds the same with or without visual input. Converging evidence suggests that the left superior temporal sulcus (STS) is critical for audiovisual integration during speech perception. We used blood-oxygen level dependent functional magnetic resonance imaging (BOLD fMRI) to measure brain activity as McGurk perceivers and non-perceivers were presented with congruent audiovisual syllables, McGurk audiovisual syllables, and non-McGurk incongruent syllables. The inferior frontal gyrus showed an effect of stimulus condition (greater responses for incongruent stimuli) but not susceptibility group, while the left auditory cortex showed an effect of susceptibility group (greater response in susceptible individuals) but not stimulus condition. Only one brain region, the left STS, showed a significant effect of both susceptibility and stimulus condition. The amplitude of the response in the left STS was significantly correlated with the likelihood of perceiving the McGurk effect: a weak STS response meant that a subject was less likely to perceive the McGurk effect, while a strong response meant that a subject was more likely to perceive it. These results suggest that the left STS is a key locus for interindividual differences in speech perception.
Conventional functional magnetic resonance imaging (FMRI) group analysis makes two key assumptions that are not always justified. First, the data from each subject is condensed into a single number per voxel, under the assumption that within-subject variance for the effect of interest is the same across all subjects or is negligible relative to the cross-subject variance. Second, it is assumed that all data values are drawn from the same Gaussian distribution with no outliers. We propose an approach that does not make such strong assumptions, and present a computationally efficient frequentist approach to FMRI group analysis, which we term mixed-effects multilevel analysis (MEMA), that incorporates both the variability across subjects and the precision estimate of each effect of interest from individual subject analyses. On average, the more accurate tests result in higher statistical power, especially when conventional variance assumptions do not hold, or in the presence of outliers. In addition, various heterogeneity measures are available with MEMA that may assist the investigator in further improving the modeling. Our method allows group effect t-tests and comparisons among conditions and among groups. In addition, it has the capability to incorporate subject-specific covariates such as age, IQ, or behavioral data. Simulations were performed to illustrate power comparisons and the capability of controlling type I errors among various significance testing methods, and the results indicated that the testing statistic we adopted struck a good balance between power gain and type I error control. Our approach is instantiated in an open-source, freely distributed program that may be used on any dataset stored in the universal neuroimaging file transfer (NIfTI) format. To date, the main impediment for more accurate testing that incorporates both within- and cross-subject variability has been the high computational cost. Our efficient implementation makes this approach practical. We recommend its use in lieu of the less accurate approach in the conventional group analysis.
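The idea of combining each subject's effect estimate with its precision can be illustrated with a minimal single-voxel sketch. This is not the MEMA program itself: the function name `weighted_group_effect` is hypothetical, and the cross-subject variance is estimated here with the DerSimonian-Laird moment estimator, a standard meta-analysis technique chosen for brevity. The abstract does not say which estimator MEMA uses, so that choice is an assumption.

```python
import math


def weighted_group_effect(betas, variances):
    """Precision-weighted group effect for one voxel (illustrative only).

    betas:     per-subject effect estimates
    variances: within-subject variances of those estimates
    Returns (group effect, standard error, cross-subject variance tau2).

    Assumption: tau2 is estimated with the DerSimonian-Laird moment
    estimator, which is not necessarily what MEMA does.
    """
    n = len(betas)
    if n < 2 or n != len(variances):
        raise ValueError("need at least two subjects with matching variances")

    # Fixed-effects step: weight each subject by within-subject precision.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * b for wi, b in zip(w, betas)) / sw

    # Heterogeneity statistic Q and the moment estimate of tau2.
    q = sum(wi * (b - fixed) ** 2 for wi, b in zip(w, betas))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (n - 1)) / c)

    # Random-effects step: weights use within- plus cross-subject variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sw_star = sum(w_star)
    effect = sum(wi * b for wi, b in zip(w_star, betas)) / sw_star
    se = math.sqrt(1.0 / sw_star)
    return effect, se, tau2
```

When every subject has the same within-subject variance, the weights are equal and the group effect reduces to the ordinary mean, which is the situation the conventional analysis assumes. Subjects with noisy estimates are down-weighted only when the variances differ.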
Humans are remarkably adept at understanding speech, even when it is contaminated by noise. Multisensory integration may explain some of this ability: combining independent information from the auditory modality (vocalizations) and the visual modality (mouth movements) reduces noise and increases accuracy. Converging evidence suggests that the superior temporal sulcus (STS) is a critical brain area for multisensory integration, but little is known about its role in the perception of noisy speech. Behavioral studies have shown that perceptual judgments are weighted by the reliability of the sensory modality: more reliable modalities are weighted more strongly, even if the reliability changes rapidly. We hypothesized that changes in the functional connectivity of STS with auditory and visual cortex could provide a neural mechanism for perceptual reliability weighting. To test this idea, we performed five blood oxygenation level-dependent functional magnetic resonance imaging and behavioral experiments in 34 healthy subjects. We found increased functional connectivity between the STS and auditory cortex when the auditory modality was more reliable (less noisy) and increased functional connectivity between the STS and visual cortex when the visual modality was more reliable, even when the reliability changed rapidly during presentation of successive words. This finding matched the results of a behavioral experiment in which the perception of incongruent audiovisual syllables was biased toward the more reliable modality, even with rapidly changing reliability. Changes in STS functional connectivity may be an important neural mechanism underlying the perception of noisy speech.
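Reliability weighting is often formalized as inverse-variance cue combination, in which each modality's weight is proportional to the reciprocal of its noise variance. The sketch below assumes that textbook model. It is not the analysis performed in the study, and the function name `combine_cues` and its parameters are hypothetical.

```python
def combine_cues(auditory_estimate, auditory_variance,
                 visual_estimate, visual_variance):
    """Combine two noisy estimates, weighting each by its reliability.

    Assumption: reliability is defined as 1 / variance, and the two cues
    carry independent noise (the standard inverse-variance model).
    Returns (combined estimate, combined variance, auditory weight).
    """
    reliability_a = 1.0 / auditory_variance
    reliability_v = 1.0 / visual_variance
    total = reliability_a + reliability_v

    # The more reliable (less noisy) modality receives the larger weight.
    weight_a = reliability_a / total
    weight_v = reliability_v / total

    combined = weight_a * auditory_estimate + weight_v * visual_estimate
    # The combined variance is smaller than either single-cue variance.
    combined_variance = 1.0 / total
    return combined, combined_variance, weight_a
```

Under this model, adding noise to the auditory signal raises its variance and shifts the weight toward vision, which is the direction of the behavioral bias described above.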
Children use information from both the auditory and visual modalities to aid in understanding speech. A dramatic illustration of this multisensory integration is the McGurk effect, an illusion in which an auditory syllable is perceived differently when it is paired with an incongruent mouth movement. However, there are significant interindividual differences in McGurk perception: some children never perceive the illusion, while others always do. Because converging evidence suggests that the posterior superior temporal sulcus (STS) is a critical site for multisensory integration, we hypothesized that activity within the STS would predict susceptibility to the McGurk effect. To test this idea, we used blood-oxygen level dependent functional magnetic resonance imaging (BOLD fMRI) in seventeen children aged 6 to 12 years to measure brain responses to three audiovisual stimulus categories: McGurk incongruent, non-McGurk incongruent and congruent syllables. Two separate analysis approaches, one using independent functional localizers and another using whole-brain voxel-based regression, showed differences in the left STS between perceivers and non-perceivers. The STS of McGurk perceivers responded significantly more than non-perceivers to McGurk syllables, but not to other stimuli, and perceivers’ hemodynamic responses in the STS were significantly prolonged. In addition to the STS, weaker differences between perceivers and non-perceivers were observed in the fusiform face area (FFA) and extrastriate visual cortex. These results suggest that the STS is an important source of interindividual variability in children’s audiovisual speech perception.