The Encoding of Speech Sounds in the Superior Temporal Gyrus

Yi, Han Gyol; Leonard, Matthew K.; Chang, Edward F.

doi:10.1016/j.neuron.2019.04.023

Cited by 262 publications

(202 citation statements)

References 176 publications

“…Mechanisms must exist to rapidly decode and bind phonetic elements and their temporal boundaries in a sequential and hierarchical manner. Studies in humans have made great advancements and fundamentally shape how we think about language processing (Yi et al, 2019). For decades it was unclear what was the most basic unit of speech that is decoded by neurons in A1.…”

Section: Circuit Foundation Of Phoneme Detection (mentioning)

confidence: 99%

Using Neural Circuit Interrogation in Rodents to Unravel Human Speech Decoding

Neophytou

Oviedo

2020

Front. Neural Circuits

The neural circuits responsible for social communication are among the least understood in the brain. Human studies have made great progress in advancing our understanding of the global computations required for processing speech, and animal models offer the opportunity to discover evolutionarily conserved mechanisms for decoding these signals. In this review article, we describe some of the most well-established speech decoding computations from human studies and describe animal research designed to reveal potential circuit mechanisms underlying these processes. Human and animal brains must perform the challenging tasks of rapidly recognizing, categorizing, and assigning communicative importance to sounds in a noisy environment. The instructions to these functions are found in the precise connections neurons make with one another. Therefore, identifying circuit-motifs in the auditory cortices and linking them to communicative functions is pivotal. We review recent advances in human recordings that have revealed the most basic unit of speech decoded by neurons is a phoneme, and consider circuit-mapping studies in rodents that have shown potential connectivity schemes to achieve this. Finally, we discuss other potentially important processing features in humans like lateralization, sensitivity to fine temporal features, and hierarchical processing. The goal is for animal studies to investigate neurophysiological and anatomical pathways responsible for establishing behavioral phenotypes that are shared between humans and animals. This can be accomplished by establishing cell types, connectivity patterns, genetic pathways and critical periods that are relevant in the development and function of social communication.

“…The relevant predictive cues in our REG scenes involve rapidly unfolding information (over several concurrent streams) that precludes overt perceptual tracking and likely engages automatic statistical tracking mechanisms (Sohoglu and Chait, 2016b). Importantly, the rates used here (3-25 Hz) are all within the range that is considered to be most critical for hearing in natural environments (Kayser, 2019; Overath et al, 2015; Teng et al, 2017; Yi et al, 2019). That older listeners exhibited a benefit of regularity therefore indicates that the capacity to extract rapid temporal structure is largely maintained with healthy aging.…”

Section: Older Listeners Demonstrate a Largely Preserved Sensitivity (mentioning)

confidence: 95%

The effect of healthy aging on change detection and sensitivity to predictable structure in crowded acoustic scenes

Kerangal

Vickers

Chait

2020

Preprint

Acknowledgments: This project was supported by a PhD studentship from Action on Hearing Loss and a BBSRC project grant (BB/P003745/1) to MC. DV is supported by Medical Research Senior Fellowship (MR/S002537/1). We are grateful to Brian Moore and Stuart Rosen for advice and discussion.

The auditory system plays a critical role in supporting our ability to detect abrupt changes in our surroundings. Here we study how this capacity is affected in the course of healthy ageing. Artificial acoustic 'scenes', populated by multiple concurrent streams of pure tones ('sources'), were used to capture the challenges of listening in complex acoustic environments. Two scene conditions were included: REG scenes consisted of sources characterized by a regular temporal structure. Matched RAND scenes contained sources which were temporally random. Changes, manifested as the abrupt disappearance of one of the sources, were introduced to a subset of the trials and participants ('young' group N=41, age 20-38 years; 'older' group N=41, age 60-82 years) were instructed to monitor the scenes for these events. Previous work demonstrated that young listeners exhibit better change detection performance in REG scenes, reflecting sensitivity to temporal structure. Here we sought to determine: (1) Whether 'baseline' change detection ability (i.e. in RAND scenes) is affected by age. (2) Whether aging affects listeners' sensitivity to temporal regularity. (3) How change detection capacity relates to listeners' hearing and cognitive profile. The results demonstrated that healthy aging is associated with reduced sensitivity to abrupt scene changes in RAND scenes but that performance does not correlate with age or standard audiological measures such as pure tone audiometry or speech in noise performance. Remarkably, older listeners' change detection performance improved substantially (up to the level exhibited by young listeners) in REG relative to RAND scenes. This suggests that the capacity to extract and track the regularity associated with scene sources, even in crowded acoustic environments, is relatively preserved in older listeners.

“…This seems difficult to explain based on considering acoustic features alone and seems consistent with the idea of visual articulations influencing the categorization of phonemes (Holt and Lotto, 2010). More generally, we take this as a further contribution to a growing body of evidence for phonological representations in cortical recordings to naturalistic speech (Brodbeck et al, 2018; Di Liberto et al, 2015; Gwilliams et al, 2020; Khalighinejad et al, 2017; Yi et al, 2019).…”

Section: Enhanced Multisensory Integration Effects At the Phonetic-le… (mentioning)

confidence: 99%

Neurophysiological indices of audiovisual speech integration are enhanced at the phonetic level for speech in noise

O’Sullivan

Crosse

Liberto

et al.

2020

Preprint

Seeing a speaker's face benefits speech comprehension, especially in challenging listening conditions. This perceptual benefit is thought to stem from the neural integration of visual and auditory speech at multiple stages of processing, whereby movement of a speaker's face provides temporal cues to auditory cortex, and articulatory information from the speaker's mouth can aid recognizing specific linguistic units (e.g., phonemes, syllables). However, it remains unclear how the integration of these cues varies as a function of listening conditions. Here we sought to provide insight on these questions by examining EEG responses to natural audiovisual, audio, and visual speech in quiet and in noise. Specifically, we represented our speech stimuli in terms of their spectrograms and their phonetic features, and then …

Significance Statement: During conversation, visual cues impact our perception of speech. Integration of auditory and visual speech is thought to occur at multiple stages of speech processing and vary flexibly depending on the listening conditions. Here we examine audiovisual integration at two stages of speech processing using the speech spectrogram and a phonetic representation, and test how audiovisual integration adapts to degraded listening conditions. We find significant integration at both of these stages regardless of listening conditions, and when the speech is noisy, we find enhanced integration at the phonetic stage of processing. These findings provide support for the multistage integration framework and demonstrate its flexibility in terms of a greater reliance on visual articulatory information in challenging listening conditions.
