Abstract: The human auditory system is exceptional at comprehending an individual speaker even in complex acoustic environments. Because the inner ear, or cochlea, possesses an active mechanism that can be controlled by subsequent neural processing centers through descending nerve fibers, it may already contribute to speech processing. The cochlear activity can be assessed by recording otoacoustic emissions (OAEs), but employing these emissions to assess speech processing in the cochlea is obstructed by the complexity o…
“…Interestingly, activation of the MOC reflex was observed for natural speech—further evidence that activation is not limited to tones and broadband noise [ 118 – 120 ]—and did not depend on whether participants were required to attend in a lexical decision task. This is consistent with natural speech being particularly salient as an ethologically relevant and nondegraded stimulus, as well as the low attentional load required when passively watching a film permitting continued monitoring of unattended speech [ 121 ].…”
The ability to navigate “cocktail party” situations by focusing on sounds of interest over irrelevant, background sounds is often considered in terms of cortical mechanisms. However, subcortical circuits such as the pathway underlying the medial olivocochlear (MOC) reflex modulate the activity of the inner ear itself, supporting the extraction of salient features from the auditory scene prior to any cortical processing. To understand the contribution of auditory subcortical nuclei and the cochlea in complex listening tasks, we made physiological recordings along the auditory pathway while listeners engaged in detecting non(sense) words in lists of words. Both naturally spoken and intrinsically noisy, vocoded speech—filtering that mimics processing by a cochlear implant (CI)—significantly activated the MOC reflex, but this was not the case for speech in background noise, which more engaged midbrain and cortical resources. A model of the initial stages of auditory processing reproduced specific effects of each form of speech degradation, providing a rationale for goal-directed gating of the MOC reflex based on enhancing the representation of the energy envelope of the acoustic waveform. Our data reveal the coexistence of 2 strategies in the auditory system that may facilitate speech understanding in situations where the signal is either intrinsically degraded or masked by extrinsic acoustic energy. Whereas intrinsically degraded streams recruit the MOC reflex to improve representation of speech cues peripherally, extrinsically masked streams rely more on higher auditory centres to denoise signals.
“…In contrast to active listening, when participants were asked to ignore the auditory stimuli and direct attention to a silent film, the MOC reflex was gated in a direction consistent with the auditory system suppressing irrelevant and expected auditory information whilst (presumably) attending to visual streams [52–54]. Interestingly, activation of the MOC reflex was observed for natural speech—further evidence that activation is not limited to tones and broadband noise [55–57]—and did not depend on whether participants were required to attend (i.e., were engaged in a lexical-decision task) or not. This can be explained by natural speech being particularly salient as an undegraded, ethologically-relevant stimulus and the low attentional load of passively watching a film resulting in the continued monitoring of unattended speech [58].…”
Navigating "cocktail party" situations by enhancing foreground sounds over irrelevant background information is typically considered from a cortico-centric perspective. However, subcortical circuits, such as the medial olivocochlear reflex (MOCR) that modulates inner ear activity itself, have ample opportunity to extract salient features from the auditory scene prior to any cortical processing. To understand the contribution of auditory subcortical nuclei and the cochleae, physiological recordings were made along the auditory pathway while listeners differentiated non(sense)-words and words. Both naturally-spoken and intrinsically-noisy, vocoded speech (filtering that mimics processing by a cochlear implant) significantly activated the MOCR, whereas listening to speech-in-background noise revealed instead engagement of midbrain and cortical resources. An auditory periphery model reproduced these speech degradation-specific effects, providing a rationale for goal-directed MOCR gating to enhance representation of speech features in the auditory nerve.
These results highlight two strategies co-existing in the auditory system to accommodate categorically different speech degradations.

…cortical auditory responses in both the active listening task and when listeners were required to ignore the task and watch a silent, non-subtitled film.
Maintaining a fixed task difficulty across speech manipulations, we found measures of hearing function at the level of the cochlea, brainstem, midbrain and cortex to be modulated differently depending on the degradation type applied to speech sounds, and on whether or not speech was actively attended. Specifically, the MOCR, assessed in terms of the magnitude of click-evoked OAEs (CEOAEs), was activated by vocoded speech (an intrinsically degraded speech signal) but not by otherwise 'clean' speech presented in either babble-noise or speech-shaped noise. Neural activity at the first synaptic stage of central processing in the cochlear nucleus (CN), assessed physiologically through auditory brainstem responses (ABRs), confirmed the reduction in cochlear gain for actively attended vocoded speech, but not speech-in-noise. Conversely, neural activity generated by the auditory midbrain was significantly increased in active vs. passive listening for speech in babble and speech-shaped noise, but not for vocoded speech. This increase was associated with elevated cortical markers of listening effort for the speech-in-noise conditions. A model of the auditory periphery including an MOC circuit with biophysically-realistic temporal dynamics confirmed the stimulus-dependent role of the MOCR in enhancing neural coding of speech signals. Our data suggest that otherwise identical performance in active listening tasks may invoke quite different efferent circuits, requiring different levels of listening effort, depending on the type of stimulus degradation experienced.
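The efferent dynamics described here can be caricatured very simply: MOC activation builds up slowly (time constants on the order of hundreds of milliseconds) in proportion to sound-driven input and applies a modest attenuation to cochlear gain. The toy loop below illustrates that principle only; the time constant, attenuation depth, and first-order form are illustrative assumptions, not the paper's biophysical model:

```python
import numpy as np

def moc_gain(envelope, fs, tau=0.28, max_attenuation_db=12.0):
    """Toy MOC efferent loop: a one-pole low-pass of the stimulus
    envelope (time constant tau, here ~280 ms, an assumed value)
    drives a slow reduction of cochlear gain, capped at
    max_attenuation_db. Returns a multiplicative gain trace
    (1.0 = no attenuation)."""
    alpha = np.exp(-1.0 / (fs * tau))          # smoother coefficient
    drive = envelope / (np.max(envelope) + 1e-12)  # normalised drive
    state = 0.0
    gains = np.empty(len(envelope), dtype=float)
    for i, d in enumerate(drive):
        state = alpha * state + (1 - alpha) * d    # smoothed efferent activity
        gains[i] = 10 ** (-max_attenuation_db * state / 20.0)
    return gains
```

For a sustained stimulus the gain relaxes over a few time constants toward its maximally attenuated value, mimicking the sluggish onset of MOC-mediated gain reduction relative to the fast acoustic waveform.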
Results
Iso-performance in three manipul...