Selective auditory attention is essential for human listeners to be able to communicate in multi-source environments. Selective attention is known to modulate the neural representation of the auditory scene, boosting the representation of a target sound relative to the background, but the strength of this modulation, and the mechanisms contributing to it, are not well understood. Here, listeners performed a behavioral experiment demanding sustained, focused spatial auditory attention while we measured cortical responses using electroencephalography (EEG). We presented three concurrent melodic streams; listeners were asked to attend and analyze the melodic contour of one of the streams, randomly selected from trial to trial. In a control task, listeners heard the same sound mixtures, but performed the contour judgment task on a series of visual arrows, ignoring all auditory streams. We found that the cortical responses could be fit as a weighted sum of event-related potentials evoked by the stimulus onsets in the competing streams. The weight assigned to a given stream was roughly 10 dB higher when it was attended than when another auditory stream was attended; during the visual task, the auditory gains were intermediate. We then used a template-matching classification scheme to classify single-trial EEG results. We found that in all subjects, we could determine, significantly better than chance, which stream the subject was attending. By directly quantifying the effect of selective attention on auditory cortical responses, these results reveal that focused auditory attention both suppresses the response to an unattended stream and enhances the response to an attended stream. The single-trial classification results add to the growing body of literature suggesting that auditory attentional modulation is sufficiently robust that it could be used as a control mechanism in brain–computer interfaces (BCIs).
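Conceptually, the model treats each single-trial EEG response as a superposition of onset-locked ERPs, one per stream, with an attention-dependent gain on each. The sketch below illustrates the idea on synthetic data; the sampling rate, onset trains, ERP shape, and ordinary-least-squares fit are illustrative assumptions, not the authors' actual pipeline.

```python
import numpy as np

# Illustrative sketch (not the authors' pipeline): model a single-trial EEG
# response as a weighted sum of onset-locked ERPs, one weight per stream,
# then decode the attended stream from the fitted weights.

rng = np.random.default_rng(0)
fs = 256                       # sampling rate in Hz (assumed)
n_samples = 4 * fs             # one 4-s trial
n_streams = 3

# Hypothetical onset trains (impulses at note onsets) for each stream.
onsets = np.zeros((n_streams, n_samples))
for k in range(n_streams):
    onsets[k, rng.choice(n_samples - fs, size=8, replace=False)] = 1.0

# Toy ERP template: a damped oscillation (purely illustrative shape).
t = np.arange(int(0.5 * fs)) / fs
erp = np.sin(2 * np.pi * 4 * t) * np.exp(-t / 0.1)

# Regressors: each stream's onset train convolved with the ERP template.
X = np.stack(
    [np.convolve(onsets[k], erp)[:n_samples] for k in range(n_streams)],
    axis=1,
)

# Simulate a trial in which stream 0 is attended (higher gain) plus noise.
true_w = np.array([1.0, 0.3, 0.3])
y = X @ true_w + 0.5 * rng.standard_normal(n_samples)

# Ordinary least squares recovers one weight per stream; read out the
# attended stream as the regressor with the largest fitted weight.
w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
attended = int(np.argmax(w_hat))
unattended = np.delete(w_hat, attended)
gain_db = 20 * np.log10(w_hat[attended] / unattended.mean())
print(f"fitted weights: {w_hat.round(2)}, attended stream: {attended}, "
      f"attentional gain: {gain_db:.1f} dB")
```

Decoding by the largest fitted weight is a simplified stand-in for the template-matching classifier described above; with attended-stream gains on the order of 10 dB, as reported here, even this crude readout separates the attention conditions well.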
Visual cues are known to aid auditory processing when they provide direct information about signal content, as in lip reading. However, some studies hint that visual cues also aid auditory perception by guiding attention to the target in a mixture of similar sounds. The current study directly tests this idea for complex, nonspeech auditory signals, using a visual cue providing only timing information about the target. Listeners were asked to identify a target zebra finch bird song played at a random time within a longer, competing masker. Two different maskers were used: noise and a chorus of competing bird songs. On half of all trials, a visual cue indicated the timing of the target within the masker. For the noise masker, the visual cue did not affect performance when target and masker were from the same location, but improved performance when target and masker were in different locations. In contrast, for the chorus masker, visual cues improved performance only when target and masker were perceived as coming from the same direction. These results suggest that simple visual cues for when to listen improve target identification by enhancing sounds near the threshold of audibility when the target is energetically masked and by enhancing segregation when it is difficult to direct selective attention to the target. Visual cues help little when target and masker already differ in attributes that enable listeners to engage selective auditory attention effectively, including differences in spectrotemporal structure and in perceived location.
The ability to direct and redirect selective auditory attention varies substantially across individuals with normal hearing thresholds, even when sounds are clearly audible. We hypothesized that these differences arise both from differences in the spectrotemporal fidelity of subcortical sound representations and from differences in the efficacy of the cortical attentional networks that modulate neural representations of the auditory scene. Here, subjects were presented with an initial stream from straight ahead and a second stream (from either the left or the right), each composed of four monotonized consonant-vowel syllables. Listeners were instructed to report the contents of either the first stream (holding attentional focus) or the second stream (switching attentional focus). Critically, the direction of the second stream informed subjects whether to hold or to switch attention. Pilot results suggest that when the lateral angle of the second stream is small, task performance is linked to the subcortical encoding fidelity of suprathreshold sound (as measured using brainstem frequency-following responses, or FFRs, obtained separately in the same subjects). Using a paradigm that allows simultaneous collection of behavioral measures, FFRs, and cortical responses, we test here whether differences in top-down attentional control explain subject variability when the second stream's lateral angle is large and coding fidelity does not limit performance.
We used a specially designed computer game to examine the behavioral consequences of audiovisual integration. Target stimuli (animated fish swimming across the computer screen) were modulated in size and/or emitted an amplitude-modulated sound. Modulations, visual or auditory, were at 6 or 7 Hz (corresponding to "slow" and "fast," respectively). In one game, subjects were instructed to categorize successive fish as "slow" or "fast" based on the auditory modulations; in another game, they categorized fish based on the visual modulation rate. In both games, subjects were instructed to ignore input from the task-irrelevant modality. Modulations could be (1) present only in the modality of interest, (2) present and matching in both modalities, or (3) present but mismatched between modalities. While reaction times were similar across games, accuracy was highest when auditory modulation was the basis for categorizing fish. Accuracy and reaction times improved when cross-modal modulation rates matched, and worsened when modulation rates conflicted. Additionally, accuracy was more strongly affected by between-modality congruence/incongruence when subjects attended to visual modulations than when they attended to auditory ones. These results indicate that audiovisual integration is not entirely under volitional control, and that competition between sensory modalities adversely impacts perception in dynamic environments.
A miniature accelerometer and microphone can be used to obtain Horii Oral-Nasal Coupling (HONC) scores, an objective measure of the nasalization of speech. While this instrumentation compares favorably in size and cost to other objective measures of nasality, the metric has not been well characterized in children. Furthermore, the measure is known to be affected by vowel loading: speech loaded with "high" vowels is consistently scored as more nasal than speech loaded with "low" vowels. Filtering the signals used to compute the HONC score, to better isolate the correlates of nasalization, has been shown to reduce these vowel-related effects, but the efficacy of filtering has thus far been explored only in adults. Here, HONC scores for running speech and for the vowel portions of consonant-vowel-consonant tokens were calculated for the speech of 26 children aged 4–9 years. Scores were computed using the broadband accelerometer and speech signals, as well as filtered, low-frequency versions of these signals. HONC scores obtained from both the broadband and the filtered signals were well separated for nasal and non-nasal speech, and scores computed from the filtered signals exhibited less within-participant variability.
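At its core, the HONC metric is a level ratio between the nasal accelerometer channel and the voice microphone channel. The sketch below shows one hedged interpretation of that computation, with an optional low-pass stage standing in for the filtering discussed above; the filter order, 350 Hz cutoff, and toy signals are assumptions for illustration, not the study's exact parameters.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

# Illustrative sketch of a HONC-style score: the dB level ratio of the nasal
# accelerometer signal to the voice microphone signal, with an optional
# low-pass stage standing in for the filtering discussed above.

def honc_db(accel, mic, fs, lowpass_hz=None):
    """HONC score in dB: 20 * log10(RMS(accel) / RMS(mic))."""
    if lowpass_hz is not None:
        sos = butter(4, lowpass_hz, btype="low", fs=fs, output="sos")
        accel = sosfiltfilt(sos, accel)
        mic = sosfiltfilt(sos, mic)
    rms = lambda x: np.sqrt(np.mean(np.square(x)))
    return 20 * np.log10(rms(accel) / rms(mic))

# Toy tokens: a "nasal" token couples more energy into the accelerometer.
fs = 16000
t = np.arange(fs) / fs                    # 1 s of signal
mic = np.sin(2 * np.pi * 220 * t)         # voiced speech stand-in
accel_nasal = 0.5 * mic                   # strong nasal coupling
accel_oral = 0.05 * mic                   # weak nasal coupling

print(honc_db(accel_nasal, mic, fs, lowpass_hz=350))  # ~ -6 dB (more nasal)
print(honc_db(accel_oral, mic, fs, lowpass_hz=350))   # ~ -26 dB (less nasal)
```

Because the score is a ratio of the two channels, nasal speech yields values closer to 0 dB and oral speech yields strongly negative values, which is the separation between nasal and non-nasal speech described above.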