During speech perception, linguistic elements such as consonants and vowels are extracted from a complex acoustic speech signal. The superior temporal gyrus (STG) participates in high-order auditory processing of speech, but how it encodes phonetic information is poorly understood. We used high-density direct cortical surface recordings in humans while they listened to natural, continuous speech to reveal the STG representation of the entire English phonetic inventory. At single electrodes, we found response selectivity to distinct phonetic features. Encoding of acoustic properties was mediated by a distributed population response. Phonetic features could be directly related to tuning for spectrotemporal acoustic cues, some of which were encoded in a nonlinear fashion or by integration of multiple cues. These findings demonstrate the acoustic-phonetic representation of speech in human STG.
Speaking is one of the most complex actions we perform, yet nearly all of us learn to do it effortlessly. Production of fluent speech requires the precise, coordinated movement of multiple articulators (e.g., lips, jaw, tongue, larynx) over rapid time scales. Here, we used high-resolution, multi-electrode cortical recordings during the production of consonant-vowel syllables to determine the organization of speech sensorimotor cortex in humans. We found speech articulator representations that were somatotopically arranged on ventral pre- and post-central gyri and partially overlapping at individual electrodes. These representations were temporally coordinated as sequences during syllable production. Spatial patterns of cortical activity revealed an emergent, population-level representation, which was organized by phonetic features. Over tens of milliseconds, the spatial patterns transitioned between distinct representations for different consonants and vowels. These results reveal the dynamic organization of speech sensorimotor cortex during the generation of multi-articulator movements underlying our ability to speak.
Speech perception requires the rapid and effortless extraction of meaningful phonetic information from a highly variable acoustic signal. A powerful example of this phenomenon is categorical speech perception, in which a continuum of acoustically varying sounds is transformed into perceptually distinct phoneme categories. Here we show that the neural representation of speech sounds is categorically organized in the human posterior superior temporal gyrus. Using intracranial high-density cortical surface arrays, we found that listening to synthesized speech stimuli varying in small and acoustically equal steps evoked distinct and invariant cortical population response patterns that were organized by their sensitivities to critical acoustic features. Phonetic category boundaries were similar between neurometric and psychometric functions. While speech-sound responses were distributed, spatially discrete cortical loci were found to underlie specific phonetic discrimination. Thus, we demonstrate direct evidence for acoustic-to-higher order phonetic level encoding of speech sounds in human language receptive cortex.
Two sets of data are discussed in terms of an exemplar resonance model of the lexicon. First, a cross-linguistic review of vowel formant measurements indicate that phonetic differences between male and female talkers are a function of language, dissociated to a certain extent from vocal tract length.Second, an auditory word recognition study (Strand, 2000) indicates that listeners can process words faster when the talker has a stereotypical sounding voice. An exemplar resonance model of perception derives these effects suggesting that reentrant pathways (Edelman, 1987) between cognitive categories and detailed exemplars of them leads to the emergence of social and linguistic entities.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.