The effect of spatial separation of sources on the masking of a speech signal was investigated for three types of maskers, ranging from energetic to informational. Normal-hearing listeners performed a closed-set speech identification task in the presence of a masker at various signal-to-noise ratios. Stimuli were presented in a quiet sound field. The signal was played from 0 degrees azimuth and a masker was played either from the same location or from 90 degrees to the right. Signals and maskers were derived from sentences that were preprocessed by a modified cochlear-implant simulation program that filtered each sentence into 15 frequency bands, extracted the envelopes from each band, and used these envelopes to modulate pure tones at the center frequencies of the bands. In each trial, the signal was generated by summing together eight randomly selected frequency bands from the preprocessed signal sentence. Three maskers were derived from the preprocessed masker sentences: (1) different-band sentence, which was generated by summing together six randomly selected frequency bands out of the seven bands not present in the signal (resulting in primarily informational masking); (2) different-band noise, which was generated by convolving the different-band sentence with Gaussian noise; and (3) same-band noise, which was generated by summing the same eight bands from the preprocessed masker sentence that were used in the signal sentence and convolving the result with Gaussian noise (resulting in primarily energetic masking). Results revealed that in the different-band sentence masker, the effect of spatial separation averaged 18 dB (at 51% correct), while in the different-band and same-band noise maskers the effect was less than 10 dB. These results suggest that, in these conditions, the advantage due to spatial separation of sources is greater for informational masking than for energetic masking.
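The band bookkeeping described in this abstract (15 bands, eight for the signal, six of the remaining seven for the different-band masker, and the same eight for the same-band masker) can be illustrated with a minimal sketch. It covers only the selection of band indices, not the filtering, envelope extraction, tone modulation, or convolution with noise. The function name `choose_bands` and the fixed seed are assumptions made for illustration and do not come from the study.

```python
import random

N_BANDS = 15    # frequency bands produced by the preprocessing (from the abstract)
N_SIGNAL = 8    # bands summed to form the signal
N_MASKER = 6    # bands summed to form the different-band masker


def choose_bands(rng):
    """Pick band indices for one trial (hypothetical helper, not the authors' code).

    Returns (signal, different_band, same_band) as sorted lists of band indices.
    """
    all_bands = list(range(N_BANDS))
    signal = sorted(rng.sample(all_bands, N_SIGNAL))
    # The seven bands left over once the signal bands are taken out.
    remaining = [b for b in all_bands if b not in signal]
    # Different-band maskers use six of those seven bands.
    different_band = sorted(rng.sample(remaining, N_MASKER))
    # The same-band masker reuses exactly the signal's bands.
    same_band = list(signal)
    return signal, different_band, same_band


rng = random.Random(0)  # fixed seed only so the sketch is repeatable
signal_bands, different_bands, same_bands = choose_bands(rng)
print("signal bands:        ", signal_bands)
print("different-band bands:", different_bands)
print("same-band bands:     ", same_bands)
```

Because the different-band maskers draw only from bands absent from the signal, they share no band with the signal, which is what limits their energetic masking in this design.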
Informational masking (IM) has a long history and is currently receiving considerable attention. Nevertheless, there is no clear and generally accepted picture of how IM should be defined, and once defined, explained. In this letter, consideration is given to the problems of defining IM and specifying research that is needed to better understand and model IM.
Informational masking was reduced using three stimulus presentation schemes that were intended to perceptually segregate the signal from the masker. The maskers were sets of sinusoids chosen randomly in frequency and intensity on each stimulus interval or, in some conditions, on every masker burst in a series of bursts within intervals. Masker components were excluded from the frequency region surrounding the 1000-Hz signal to minimize energetic masking. Masked thresholds as great as 60–70 dB above quiet threshold were observed for some subjects in some conditions. It was shown that this informational masking could be reduced by as much as 40 dB by: (1) presenting the masker to both ears and the signal to one ear; (2) playing different masker samples sequentially in each interval of every trial; or (3) presenting the signal in alternate bursts of multiple, identical masker samples. For the binaural manipulation, informational masking was reduced because the masker and signal were perceived as originating from different interaural locations. In the latter two manipulations, a difference in the spectral or temporal pattern of the signal and masker provided the detection cue. These effects were interpreted as evidence of the importance of perceptual segregation of sounds in noisy listening environments where signal reception is not limited by energetic masking.
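The masker construction in this abstract (sinusoids drawn at random in frequency and intensity, with none allowed near the 1000-Hz signal) can be sketched as below. Every numeric parameter here is a placeholder assumption, since the abstract gives none: the number of components, the frequency range, the width of the protected region, the level range, and the log-uniform frequency draw are all hypothetical.

```python
import math
import random

SIGNAL_HZ = 1000.0  # signal frequency stated in the abstract


def draw_masker(rng, n_components=8, f_lo=200.0, f_hi=5000.0,
                protect_lo=800.0, protect_hi=1250.0,
                level_lo=40.0, level_hi=60.0):
    """Draw one random multitone masker as a sorted list of (freq_hz, level_db).

    All default values are illustrative assumptions, not the study's parameters.
    """
    components = []
    while len(components) < n_components:
        # Log-uniform draw over the allowed range (an assumption of this sketch).
        freq = math.exp(rng.uniform(math.log(f_lo), math.log(f_hi)))
        # Reject any component inside the protected region around the signal.
        if protect_lo <= freq <= protect_hi:
            continue
        level = rng.uniform(level_lo, level_hi)
        components.append((freq, level))
    return sorted(components)


rng = random.Random(1)  # fixed seed only so the sketch is repeatable
masker = draw_masker(rng)
for freq, level in masker:
    print(f"{freq:8.1f} Hz  {level:5.1f} dB")
```

Calling `draw_masker` once per interval, or once per burst, would correspond to the two randomization schedules the abstract mentions.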
This study investigated the interaction between hearing loss, reverberation, and age on the benefit of spatially separating multiple masking talkers from a target talker. Four listener groups were tested based on hearing status and age. On every trial listeners heard three different sentences spoken simultaneously by different female talkers. Listeners reported keywords from the target sentence, which was presented at a fixed and known location. Maskers were colocated with the target or presented from spatially separated and symmetrically placed loudspeakers, creating a situation with no simple "better-ear." Reverberation was also varied. The target-to-masker ratio at threshold for identification of the fixed-level target was measured by adapting the level of the maskers. On average, listeners with hearing loss showed less spatial release from masking than normal-hearing listeners. Age was a significant factor although small differences in hearing sensitivity across age groups may have contributed to this effect. Spatial release was reduced in the more reverberant room condition but in most cases a significant advantage remained. These results provide evidence for a large benefit of spatial separation in a multitalker situation that is likely due to perceptual factors. However, this benefit is significantly reduced by both hearing loss and reverberation.
Spatial release from masking was studied in a three-talker soundfield listening experiment. The target talker was presented at 0 degrees azimuth and the maskers were either colocated or symmetrically positioned around the target, with a different masker talker on each side. The symmetric placement greatly reduced any "better ear" listening advantage. When the maskers were separated from the target by ±15 degrees, the average spatial release from masking was 8 dB. Wider separations increased the release to more than 12 dB. This large effect was eliminated when binaural cues and perceived spatial separation were degraded by covering one ear with an earplug and earmuff. Increasing reverberation in the room increased the target-to-masker ratio (TM) for the separated, but not colocated, conditions, reducing the release from masking, although a significant advantage of spatial separation remained. Time reversing the masker speech improved performance in both the colocated and spatially separated cases but lowered TM the most for the colocated condition, also resulting in a reduction in the spatial release from masking. Overall, the spatial tuning observed appears to depend on the presence of interaural differences that improve the perceptual segregation of sources and facilitate the focus of attention at a point in space.
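The arithmetic behind the spatial release figures quoted in these abstracts is a difference of thresholds: the target-to-masker ratio (TM) at threshold in the colocated condition minus the TM at threshold in the separated condition, with lower TM meaning better performance. A minimal sketch follows. The threshold values in it are made-up placeholders chosen only to show the calculation, not data from any of the studies, and the function name `spatial_release` is hypothetical.

```python
def spatial_release(tm_colocated_db, tm_separated_db):
    """Spatial release from masking in dB.

    Positive values mean the listener tolerates a lower target-to-masker
    ratio when the maskers are spatially separated from the target.
    """
    return tm_colocated_db - tm_separated_db


# Placeholder thresholds (dB), invented for illustration only.
colocated = 0.0
separated_quiet_room = -5.0
separated_reverberant_room = -3.0   # a higher TM is needed at threshold

release_quiet = spatial_release(colocated, separated_quiet_room)
release_reverb = spatial_release(colocated, separated_reverberant_room)
print("release, less reverberant room:", release_quiet, "dB")
print("release, more reverberant room:", release_reverb, "dB")
```

The second case mirrors the pattern described above: if reverberation raises the separated threshold while the colocated threshold stays put, the release shrinks but can remain positive.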
This study examined the role of focused attention along the spatial (azimuthal) dimension in a highly uncertain multitalker listening situation. The task of the listener was to identify key words from a target talker in the presence of two other talkers simultaneously uttering similar sentences. When the listener had no a priori knowledge about target location, or which of the three sentences was the target sentence, performance was relatively poor, near the value expected simply from choosing to focus attention on only one of the three locations. When the target sentence was cued before the trial, but location was uncertain, performance improved significantly relative to the uncued case. When spatial location information was provided before the trial, performance improved significantly for both cued and uncued conditions. If the location of the target was certain, proportion correct identification performance was higher than 0.9 independent of whether the target was cued beforehand. In contrast to studies in which known versus unknown spatial locations were compared for relatively simple stimuli and tasks, the results of the current experiments suggest that the focus of attention along the spatial dimension can play a very significant role in solving the "cocktail party" problem.
The ability to understand speech in a multi-source environment containing informational masking may depend on the perceptual arrangement of signal and masker objects in space. In normal-hearing listeners, Arbogast et al. [J. Acoust. Soc. Am. 112, 2086-2098 (2002)] found an 18-dB spatial release from a primarily informational masker, compared to 7 dB for a primarily energetic masker. This article extends the earlier work to include the study of listeners with sensorineural hearing loss. Listeners performed closed-set speech recognition in two spatial conditions: 0 degrees and 90 degrees separation between signal and masker. Three maskers were tested: (1) the different-band sentence masker was designed to be primarily informational; (2) the different-band noise masker was a control for the different-band sentence; and (3) the same-band noise masker was designed to be primarily energetic. The spatial release from the different-band sentence was larger than for the other maskers, but was smaller (10 dB) for the hearing-impaired group than for the normal-hearing group (15 dB). The smaller benefit for the hearing-impaired listeners can be partially explained by masker sensation level. However, the results suggest that hearing-impaired listeners can use the perceptual effect of spatial separation to improve speech recognition in the presence of a primarily informational masker.