Our ability to selectively attend to one auditory signal amidst competing input streams, epitomized by the ‘Cocktail Party’ problem, continues to stimulate research across a variety of approaches. How this demanding perceptual feat is achieved from a neural systems perspective remains unclear and controversial. It is well established that neural responses to attended stimuli are enhanced compared to responses to ignored ones, but responses to ignored stimuli are nonetheless highly significant, leading to interference in performance. We investigated whether congruent visual input of an attended speaker enhances cortical selectivity in auditory cortex, leading to diminished representation of ignored stimuli. We recorded magnetoencephalographic (MEG) signals from human participants as they attended to segments of natural continuous speech. Using two complementary methods of quantifying the neural response to speech, we found that viewing a speaker’s face enhances the capacity of auditory cortex to track the temporal speech envelope of that speaker. This mechanism was most effective in a ‘Cocktail Party’ setting, promoting preferential tracking of the attended speaker, whereas without visual input no significant attentional modulation was observed. These neurophysiological results underscore the importance of visual input in resolving perceptual ambiguity in a noisy environment. Since visual cues in speech precede the associated auditory signals, they likely serve a predictive role in facilitating auditory processing of speech, perhaps by directing attentional resources to appropriate points in time when to-be-attended acoustic input is expected to arrive.
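The kind of speech-envelope tracking described in this abstract can be sketched in Python. The abstract does not specify the study's two quantification methods, so everything below is an illustrative assumption: a Hilbert-envelope extraction followed by a lagged cross-correlation between the envelope and a single MEG channel. Function names, filter settings, and the lag range are hypothetical.

```python
import numpy as np
from scipy.signal import hilbert, butter, filtfilt

def speech_envelope(audio, fs, cutoff=10.0):
    """Broadband temporal envelope of an audio waveform: magnitude of
    the analytic signal, low-pass filtered below `cutoff` Hz."""
    env = np.abs(hilbert(audio))
    b, a = butter(3, cutoff / (fs / 2), btype="low")
    return filtfilt(b, a, env)

def tracking_score(envelope, meg, max_lag):
    """Peak normalized cross-correlation between the speech envelope and
    one MEG channel over lags 0..max_lag samples (MEG lagging stimulus).
    Returns (best correlation, best lag in samples)."""
    env = (envelope - envelope.mean()) / envelope.std()
    sig = (meg - meg.mean()) / meg.std()
    n = len(env)
    scores = [np.dot(env[: n - lag], sig[lag:]) / (n - lag)
              for lag in range(max_lag + 1)]
    return max(scores), int(np.argmax(scores))
```

Under this sketch, comparing `tracking_score` for the attended versus the ignored speaker's envelope against the same MEG channel would quantify the preferential tracking the abstract reports.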
Historically, the study of speech processing has emphasized a strong link between auditory perceptual input and motor production output [1–4]. A kind of ‘parity’ is essential, as both perception- and production-based representations must form a unified interface to facilitate access to higher-order language processes such as syntax and semantics, believed to be computed in the dominant, typically left, hemisphere [5,6]. While various theories have been proposed to unite perception and production [2,7], the underlying neural mechanisms remain unclear. Early models of speech and language processing proposed that perceptual processing occurred in the left posterior superior temporal gyrus (Wernicke’s area) and motor production processes occurred in the left inferior frontal gyrus (Broca’s area) [8,9]. Sensory activity was proposed to link to production activity via connecting fiber tracts, forming the left-lateralized speech sensory-motor system [10]. While recent evidence indicates that speech perception occurs bilaterally [11–13], prevailing models maintain that the speech sensory-motor system is left lateralized [11,14–18] and facilitates the transformation from sensory-based auditory representations to motor-based production representations [11,15,16]. Evidence for the lateralized computation of sensory-motor speech transformations is, however, indirect, coming primarily from lesion patients with speech repetition deficits (conduction aphasia) and from studies using covert speech and hemodynamic functional imaging [16,19]. Whether the speech sensory-motor system is lateralized, like higher-order language processes, or bilateral, like speech perception, remains controversial. Here, using direct neural recordings in subjects performing sensory-motor tasks involving overt speech production, we show that sensory-motor transformations occur bilaterally.
We demonstrate that electrodes over bilateral inferior frontal, inferior parietal, superior temporal, premotor, and somatosensory cortices exhibit robust sensory-motor neural responses during both perception and production in an overt word repetition task. Using a non-word transformation task, we show that bilateral sensory-motor responses can perform transformations between speech perception- and production-based representations. These results establish a bilateral sublexical speech sensory-motor system.
Recent work has implicated low-frequency (<20 Hz) neuronal phase information as important for both auditory (<10 Hz) and speech [theta (∼4-8 Hz)] perception. Activity on the timescale of theta corresponds linguistically to the average length of a syllable, suggesting that information within this range has consequences for the segmentation of meaningful units of speech. Longer timescales that correspond to lower frequencies [delta (1-3 Hz)] also reflect important linguistic features (prosodic/suprasegmental), but it is unknown whether the patterns of activity in this range are similar to those in theta. We investigate low-frequency activity with magnetoencephalography (MEG) and mutual information (MI), an analysis that has not yet been applied to noninvasive electrophysiological recordings. We find that during speech perception each frequency subband examined [delta (1-3 Hz), theta(low) (3-5 Hz), theta(high) (5-7 Hz)] carries independent information from the speech stream. This contrasts with hypotheses that either delta and theta reflect their corresponding linguistic levels of analysis or that each band is part of a single holistic onset response that tracks global acoustic transitions in the speech stream. Single-trial template-based classifier results further validate this finding: information from each subband can be used to classify individual sentences, and classifiers that combine frequency bands outperform those based on single bands alone. Our results suggest that during speech perception the low-frequency phase of the MEG signal corresponds neither to abstract linguistic units nor to holistic evoked potentials but rather tracks different aspects of the input signal. This study also validates a new method of analysis for noninvasive electrophysiological recordings that can be used to formally characterize the information content of neural responses and the interactions between these responses.
Furthermore, it bridges results from different levels of neurophysiological study: small-scale multiunit recordings and local field potentials and macroscopic magneto/electrophysiological noninvasive recordings.
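The phase-based mutual-information analysis this abstract describes can be sketched as follows: band-limit a signal to one subband (e.g., delta or theta), extract its instantaneous Hilbert phase, and estimate MI between two phase time series from a binned joint histogram. The band edges, bin count, and histogram estimator below are illustrative assumptions; the study's exact MI estimator is not specified here.

```python
import numpy as np
from scipy.signal import hilbert, butter, filtfilt

def band_phase(x, fs, lo, hi, order=3):
    """Instantaneous phase of x restricted to the [lo, hi] Hz band."""
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return np.angle(hilbert(filtfilt(b, a, x)))

def phase_mutual_info(phase_x, phase_y, n_bins=8):
    """Mutual information (bits) between two phase time series,
    estimated from a joint histogram over equal-width phase bins."""
    edges = np.linspace(-np.pi, np.pi, n_bins + 1)
    joint, _, _ = np.histogram2d(phase_x, phase_y, bins=[edges, edges])
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of phase_x
    py = pxy.sum(axis=0, keepdims=True)   # marginal of phase_y
    nz = pxy > 0
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))
```

In this sketch, comparing MI between the speech stimulus phase and the MEG phase within each subband, and across subbands, would address whether the bands carry independent information; a real analysis would also need a bias correction or surrogate (shuffled-trial) baseline for the histogram estimator.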
Objective. Brain functions such as perception, motor control, learning, and memory arise from the coordinated activity of neuronal assemblies distributed across multiple brain regions. While major progress has been made in understanding the function of individual neurons, circuit interactions remain poorly understood. A fundamental obstacle to deciphering circuit interactions is the limited availability of research tools to observe and manipulate the activity of large, distributed neuronal populations in humans. Here we describe the development, validation, and dissemination of flexible, high-resolution, thin-film (TF) electrodes for recording neural activity in animals and humans. Approach. We leveraged standard flexible printed-circuit manufacturing processes to build high-resolution TF electrode arrays. We used biocompatible materials to form the substrate (liquid crystal polymer; LCP), metals (Au, PtIr, and Pd), molding (medical-grade silicone), and 3D-printed housing (nylon). We designed a custom, miniaturized, digitizing headstage to reduce the number of cables required to connect to the acquisition system and to reduce the distance between the electrodes and the amplifiers. A custom mechanical system enabled the electrodes and headstages to be pre-assembled prior to sterilization, minimizing the setup time required in the operating room. PtIr electrode coatings lowered impedance and enabled stimulation. High-volume commercial manufacturing enables cost-effective production of LCP-TF electrodes in large quantities. Main Results. Our LCP-TF arrays achieve 25× higher electrode density, 20× higher channel count, and 11× lower stiffness compared with conventional clinical electrodes. We validated our LCP-TF electrodes in multiple human intraoperative recording sessions and have disseminated this technology to >10 research groups. Using these arrays, we have observed high-frequency neural activity with sub-millimeter resolution. Significance. Our LCP-TF electrodes will advance human neuroscience research and improve clinical care by enabling broad access to transformative, high-resolution electrode arrays.