How humans extract the identity of speech sounds from highly variable acoustic signals remains unclear. Here, we use searchlight representational similarity analysis (RSA) to localize and characterize neural representations of syllables at different levels of the hierarchically organized temporo-frontal pathways for speech perception. We asked participants to listen to spoken syllables that differed considerably in their surface acoustic form, by changing the speaker and by degrading the acoustics with noise-vocoding and sine-wave synthesis, while we recorded neural responses with functional magnetic resonance imaging. We found evidence for a graded hierarchy of abstraction across the brain. At the peak of the hierarchy, neural representations in somatomotor cortex encoded syllable identity but not surface acoustic form; at the base of the hierarchy, primary auditory cortex showed the reverse. In between, bilateral temporal cortex exhibited an intermediate response, encoding both syllable identity and the surface acoustic form of speech. Regions of somatomotor cortex associated with encoding syllable identity in perception were also engaged when producing the same syllables in a separate session. These findings are consistent with a hierarchical account of how variable acoustic signals are transformed into abstract representations of the identity of speech sounds.
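For readers unfamiliar with RSA, the sketch below illustrates the core computation the abstract above refers to: correlating a neural representational dissimilarity matrix (RDM) with model RDMs capturing syllable identity versus surface acoustic form. All data, shapes, and condition labels here are simulated placeholders, not the study's stimuli or pipeline.

```python
# Minimal RSA sketch (illustrative only; not the authors' code).
# Assumes hypothetical response patterns from one searchlight sphere.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

n_stimuli, n_voxels = 16, 50                       # e.g. 4 syllables x 4 surface forms
patterns = rng.normal(size=(n_stimuli, n_voxels))  # placeholder fMRI patterns

# Neural RDM: pairwise 1 - Pearson correlation between stimulus patterns
neural_rdm = pdist(patterns, metric="correlation")

# Model RDMs: pairs of stimuli are dissimilar (1) or similar (0) under each hypothesis
syllable = np.repeat(np.arange(4), 4)  # syllable identity of each stimulus
surface = np.tile(np.arange(4), 4)     # surface acoustic form of each stimulus
identity_rdm = pdist(syllable[:, None], metric="hamming")
acoustic_rdm = pdist(surface[:, None], metric="hamming")

# Rank-correlate the neural RDM with each model RDM; a searchlight
# analysis repeats this in a small sphere centred on every voxel.
rho_identity, _ = spearmanr(neural_rdm, identity_rdm)
rho_acoustic, _ = spearmanr(neural_rdm, acoustic_rdm)
print(f"identity model: {rho_identity:.3f}, acoustic model: {rho_acoustic:.3f}")
```

A region at the peak of the proposed hierarchy would show a reliable correlation with the identity model but not the acoustic model; primary auditory cortex would show the opposite pattern.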
There is much interest in the idea that musicians perform better than non-musicians at understanding speech in background noise. Research in this area has often used energetic maskers, which have their effects primarily at the auditory periphery. However, masking interference can also occur at more central auditory levels, known as informational masking. This experiment extends existing research by using multiple maskers that vary in their informational content and similarity to speech, in order to examine differences in perception of masked speech between trained musicians (n = 25) and non-musicians (n = 25). Although musicians outperformed non-musicians on a measure of frequency discrimination, they showed no advantage in perceiving masked speech. Further analysis revealed that nonverbal IQ, rather than musicianship, significantly predicted speech reception thresholds in noise. The results strongly suggest that the contribution of general cognitive abilities needs to be taken into account in any investigation of individual variability in perceiving speech in noise.
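The key inference above rests on a regression in which musicianship and nonverbal IQ jointly predict speech reception thresholds (SRTs). A minimal sketch of that style of analysis, using simulated placeholder data rather than the study's measurements, might look like this:

```python
# Hedged sketch of a regression of SRTs on musicianship and nonverbal IQ.
# Data are simulated placeholders; effect sizes are invented for illustration.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 50
df = pd.DataFrame({
    "musician": np.repeat([1, 0], n // 2),  # 25 musicians, 25 non-musicians
    "nv_iq": rng.normal(100, 15, n),        # nonverbal IQ scores
})
# Simulate SRTs (dB SNR; lower = better) driven by IQ, not musicianship
df["srt"] = -2.0 - 0.05 * (df["nv_iq"] - 100) + rng.normal(0, 1, n)

# With both predictors in the model, the IQ coefficient (not musicianship)
# carries the effect in this simulated example, mirroring the reported result.
model = smf.ols("srt ~ musician + nv_iq", data=df).fit()
print(model.summary().tables[1])
```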
Humans can generate mental auditory images of voices or songs, sometimes perceiving them almost as vividly as perceptual experiences. The functional networks supporting auditory imagery have been described, but less is known about the systems associated with interindividual differences in auditory imagery. Combining voxel-based morphometry and fMRI, we examined the structural basis of interindividual differences in how auditory images are subjectively perceived, and explored associations between auditory imagery, sensory-based processing, and visual imagery. Vividness of auditory imagery correlated with gray matter volume in the supplementary motor area (SMA), parietal cortex, medial superior frontal gyrus, and middle frontal gyrus. An analysis of functional responses to different types of human vocalizations revealed that the SMA and parietal sites that predict imagery are also modulated by sound type. Using representational similarity analysis, we found that higher representational specificity of heard sounds in SMA predicts vividness of imagery, indicating a mechanistic link between sensory- and imagery-based processing in sensorimotor cortex. Vividness of imagery in the visual domain also correlated with SMA structure, and with auditory imagery scores. Altogether, these findings provide evidence for a signature of imagery in brain structure, and highlight a common role of perceptual–motor interactions for processing heard and internally generated auditory information.
Auditory verbal hallucinations (hearing voices) are typically associated with psychosis, but a minority of the general population also experience them frequently and without distress. Such 'non-clinical' experiences offer a rare and unique opportunity to study hallucinations apart from confounding clinical factors, thus allowing for the identification of symptom-specific mechanisms. Recent theories propose that hallucinations result from an imbalance of prior expectation and sensory information, but whether such an imbalance also influences auditory-perceptual processes remains unknown. We examine for the first time the cortical processing of ambiguous speech in people without psychosis who regularly hear voices. Twelve non-clinical voice-hearers and 17 matched controls completed a functional magnetic resonance imaging scan while passively listening to degraded speech ('sine-wave' speech) that was either potentially intelligible or unintelligible. Voice-hearers reported recognizing the presence of speech in the stimuli before controls did, and before being explicitly informed of its intelligibility. Across both groups, intelligible sine-wave speech engaged a typical left-lateralized speech processing network. Notably, however, voice-hearers showed stronger intelligibility responses than controls in the dorsal anterior cingulate cortex and in the superior frontal gyrus. This suggests an enhanced involvement of attention and sensorimotor processes, selectively when speech was potentially intelligible. Altogether, these behavioural and neural findings indicate that people with hallucinatory experiences show distinct responses to meaningful auditory stimuli. A greater weighting towards prior knowledge and expectation might cause non-veridical auditory sensations in these individuals, but it might also spontaneously facilitate perceptual processing where such knowledge is required. This has implications for the understanding of hallucinations in clinical and non-clinical populations, and is consistent with current 'predictive processing' theories of psychosis.
An anterior pathway, concerned with extracting meaning from sound, has been identified in nonhuman primates. An analogous pathway has been suggested in humans, but controversy exists concerning the degree of lateralization and the precise location where responses to intelligible speech emerge. We have demonstrated that the left anterior superior temporal sulcus (STS) responds preferentially to intelligible speech (Scott SK, Blank CC, Rosen S, Wise RJS. 2000. Identification of a pathway for intelligible speech in the left temporal lobe. Brain. 123:2400–2406.). A functional magnetic resonance imaging study in Cerebral Cortex used equivalent stimuli and univariate and multivariate analyses to argue for the greater importance of the bilateral posterior STS, as compared with the left anterior STS, in responding to intelligible speech (Okada K, Rong F, Venezia J, Matchin W, Hsieh IH, Saberi K, Serences JT, Hickok G. 2010. Hierarchical organization of human auditory cortex: evidence from acoustic invariance in the response to intelligible speech. Cereb Cortex. 20:2486–2495.). Here, we also replicate our original study, demonstrating that the left anterior STS exhibits the strongest univariate response and, in decoding using the bilateral temporal cortex, contains the most informative voxels showing an increased response to intelligible speech. In contrast, in classifications using local “searchlights” and a whole-brain analysis, we find greater classification accuracy in posterior rather than anterior temporal regions. Thus, we show that the precise nature of the multivariate analysis used will emphasize different response profiles associated with complex sound-to-speech processing.
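As a concrete illustration of the "searchlight" classification described above, the following hedged sketch shows cross-validated decoding of intelligible versus unintelligible conditions from voxel patterns in one candidate region; in a searchlight analysis, the same classification is repeated within a small sphere centred on every voxel, yielding a whole-brain accuracy map. Data, shapes, and labels are simulated placeholders, not the study's actual pipeline.

```python
# Minimal MVPA decoding sketch (illustrative only; not the authors' code).
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(1)

n_trials, n_voxels = 80, 30
X = rng.normal(size=(n_trials, n_voxels))  # placeholder voxel patterns
y = rng.integers(0, 2, size=n_trials)      # 1 = intelligible, 0 = unintelligible
runs = np.repeat(np.arange(8), 10)         # scanner run of each trial

# Leave-one-run-out cross-validation avoids train/test leakage across runs.
scores = cross_val_score(LinearSVC(), X, y, groups=runs, cv=LeaveOneGroupOut())
print(f"mean decoding accuracy: {scores.mean():.2f}")
```

Whether this local decoding or a whole-region analysis is used changes which voxels contribute, which is precisely why the abstract argues that the choice of multivariate analysis emphasizes different response profiles.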
The question of hemispheric lateralization of neural processes is one that is pertinent to a range of subdisciplines of cognitive neuroscience. Language is often assumed to be left-lateralized in the human brain, but there has been a long-running debate about the underlying reasons for this. We addressed this problem with fMRI by identifying the neural responses to amplitude and spectral modulations in speech, and how these interact with speech intelligibility, to test previous claims for hemispheric asymmetries in acoustic and linguistic processes in speech perception. We used both univariate and multivariate analyses of the data, which enabled us both to identify the networks involved in processing these acoustic and linguistic factors and to test the significance of any apparent hemispheric asymmetries. We demonstrate bilateral activation of superior temporal cortex in response to speech-derived acoustic modulations in the absence of intelligibility. However, in a contrast of amplitude- and spectrally-modulated conditions that differed only in their intelligibility (where one was partially intelligible and the other unintelligible), we show a left-dominant pattern of activation in insula, inferior frontal cortex and superior temporal sulcus. Crucially, multivariate pattern analysis (MVPA) showed that there were significant differences between the left and the right hemispheres only in the processing of intelligible speech. This result shows that the left-hemisphere dominance in linguistic processing does not arise from low-level, speech-derived acoustic factors, and that MVPA provides a method for unbiased testing of hemispheric asymmetries in processing.
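In the spirit of the MVPA asymmetry test described above, one common way to formalize "significant differences between the left and the right hemispheres" is a paired comparison of per-subject decoding accuracies from matched left- and right-hemisphere regions. The sketch below uses simulated placeholder accuracies; the study's actual statistics may differ.

```python
# Hedged sketch of testing a hemispheric asymmetry in decoding accuracy.
# All numbers are simulated placeholders for illustration.
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(2)
n_subjects = 20

# Per-subject classification accuracies from left- and right-hemisphere ROIs
acc_left = rng.normal(0.62, 0.05, n_subjects)   # placeholder accuracies
acc_right = rng.normal(0.55, 0.05, n_subjects)

# Paired non-parametric test: is left-hemisphere decoding reliably better?
stat, p = wilcoxon(acc_left, acc_right)
print(f"Wilcoxon W = {stat:.1f}, p = {p:.4f}")
```

Because the test is paired within subjects, it controls for overall differences in signal quality between participants, making it a reasonably unbiased way to ask whether decodable information is lateralized.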
Spoken conversations typically take place in noisy environments, and different kinds of masking sounds place differing demands on cognitive resources. Previous studies, examining the modulation of neural activity associated with the properties of competing sounds, have shown that additional speech streams engage the superior temporal gyrus. However, the absence of a condition in which target speech was heard without additional masking made it difficult to identify brain networks specific to masking and to ascertain the extent to which competing speech was processed equivalently to target speech. In this study, we scanned young healthy adults with continuous functional magnetic resonance imaging (fMRI) whilst they listened to stories masked by sounds that differed in their similarity to speech. We show that auditory attention and control networks are activated during attentive listening to masked speech in the absence of an overt behavioural task. We demonstrate that competing speech is processed predominantly in the left hemisphere, within the same pathway as target speech, but is not treated equivalently within that stream, and that individuals who perform better on speech-in-noise tasks activate the left mid-posterior superior temporal gyrus more strongly. Finally, we identify neural responses associated with the onset of sounds in the auditory environment: activity was found within right-lateralised frontal regions, consistent with a phasic alerting response. Taken together, these results provide a comprehensive account of the neural processes involved in listening in noise.
To investigate how hearing status, sign language experience, and task demands influence functional responses in the human superior temporal cortices (STC), we collected fMRI data from deaf and hearing participants (male and female) who acquired sign language either early or late in life. Our stimuli in all tasks were pictures of objects. We varied the linguistic and visuospatial processing demands in three different tasks that involved decisions about (1) the sublexical (phonological) structure of the British Sign Language (BSL) signs for the objects, (2) the semantic category of the objects, and (3) the physical features of the objects.

Neuroimaging data revealed that in participants who were deaf from birth, STC showed increased activation during visual processing tasks. Importantly, this differed across hemispheres. Right STC was consistently activated regardless of the task, whereas left STC was sensitive to task demands. Significant activation was detected in the left STC only for the BSL phonological task. This task, we argue, placed greater demands on visuospatial processing than the other two tasks. In hearing signers, enhanced activation was absent in both left and right STC during all three tasks. Lateralization analyses demonstrated that the effect of deafness was more task-dependent in the left than the right STC, whereas it was more task-independent in the right than the left STC. These findings indicate how the absence of auditory input from birth leads to dissociable and altered functions of left and right STC in deaf participants.

SIGNIFICANCE STATEMENT Those born deaf can offer unique insights into neuroplasticity, in particular in regions of superior temporal cortex (STC) that primarily respond to auditory input in hearing people. Here we demonstrate that in those deaf from birth, the left and the right STC have altered and dissociable functions. The right STC was activated regardless of demands on visual processing. In contrast, the left STC was sensitive to the demands of visuospatial processing. Furthermore, hearing signers, with the same sign language experience as the deaf participants, did not activate the STCs. Our data advance current understanding of neural plasticity by determining the differential effects that hearing status and task demands can have on left and right STC function.