Speech recognition was measured in three groups of listeners: those with sensorineural hearing loss of (presumably) cochlear origin (HL), those with normal hearing (NH), and those with normal hearing who listened in the presence of a spectrally shaped noise that elevated their pure-tone thresholds to match those of individual listeners in the HL group (NM). Performance was measured in four backgrounds that differed only in their temporal envelope: steady-state (SS) speech-shaped noise, speech-shaped noise modulated by the envelope of multi-talker babble (MT), speech-shaped noise modulated by the envelope of single-talker speech (ST), and speech-shaped noise modulated by a 10-Hz square wave (SQ). Threshold signal-to-noise ratios (SNRs) were typically best in the ST and especially the SQ conditions, indicating a masking release in those modulated backgrounds. SNRs in the SS and MT conditions were essentially identical to one another. The masking release was largest in the listeners in the NH group, and it tended to decrease as hearing loss increased. In 5 of the 11 listeners in the HL group, the masking release was nearly identical to that obtained in the NM group matched to those listeners; in the other 6 listeners, the release was smaller than that in the NM group. The threshold-matching noise simulated the reduced masking release best in those HL listeners whose masking release was relatively large. These results suggest that the reduced masking release for speech in listeners with sensorineural hearing loss can be accounted for entirely by reduced audibility in some, but not all, listeners.
Temporal processing of suprathreshold sounds was examined in a group of young normal-hearing subjects (mean age of 26.0 years), and in three groups of older subjects (mean ages of 54.3, 64.8, and 72.2 years) with normal hearing or mild sensorineural hearing loss. Three experiments were performed. In the first experiment (modulation detection), subjects were asked to detect sinusoidal amplitude modulation (SAM) of a broadband noise, for modulation frequencies ranging from 2–1024 Hz. In the second experiment (modulation masking), the task was to detect a SAM signal (modulation frequency of 8 Hz) in the presence of a 100%-modulated SAM masker. Masker modulation frequency ranged from 2–64 Hz. In the final experiment, speech understanding was measured as a function of signal-to-noise ratio in both an unmodulated background noise and in a SAM background noise that had a modulation frequency of 8 Hz and a modulation depth of 100%. Except for a very modest correlation between age and modulation detection sensitivity at low modulation frequencies, there were no significant effects of age once the effect of hearing loss was taken into account. The results of the experiments suggest, however, that subjects with even a mild sensorineural hearing loss may have difficulty with a modulation masking task, and may not understand speech as well as normal-hearing subjects do in a modulated noise background.
The forward masking of a sinusoidal signal by a sinusoid of the same frequency was investigated for frequencies ranging from 125 to 4000 Hz. Forward masking in dB is proportional to both masker level and log signal delay at each frequency. More forward masking occurs at very low frequencies than at high frequencies, given equal-sensation-level maskers, and masked thresholds are greater at low frequencies than at high frequencies given equal-SPL maskers. The data can be described equally well by assuming that the difference in forward masking as a function of frequency is due to a change in the time course of recovery from masking or to a change in the growth of masking at each signal delay. The frequency effect is not large enough to change the interpretation of forward-masking data in studies of suppression or psychophysical tuning curves.
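The stated dependence of forward masking on masker level and log signal delay can be summarized in a simple descriptive form (a sketch only; the coefficients $a_f$, $b_f$, and $c_f$ are illustrative frequency-dependent constants, not values fitted in the study):

```latex
% Amount of forward masking M (in dB) of a signal at frequency f,
% for a masker of level L_m (dB) and a signal delay t:
M_f(L_m, t) \;=\; a_f \, L_m \;-\; b_f \log_{10} t \;+\; c_f,
\qquad a_f, b_f > 0 .
```

The negative sign on the log-delay term reflects recovery from masking as the signal delay increases; under this description, the frequency effect reported above could be carried either by $b_f$ (a change in the time course of recovery) or by $a_f$ (a change in the growth of masking at each delay), consistent with the abstract's statement that the two accounts fit the data equally well.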
The audibility of a given target component in certain spectral complexes can be considerably increased by prior exposure to the complex with the target component deleted. This "enhancement effect" can be observed under a wide variety of conditions and presumably reflects frequency-specific adaptation: the frequency region around the target frequency is not adapted during the exposure and hence is relatively more sensitive. Data from the present study indicate that an enhanced component in a harmonic complex produces more forward masking of a sinusoidal probe than when that component is not enhanced, i.e., an enhanced component behaves as if it were physically more intense. This suggests that the adaptation process underlying the enhancement effect produces an increase in gain in the unadapted frequency region. This increase might result from a decrease, due to adaptation, of suppression of the unadapted region.
Modulation thresholds were measured for a sinusoidally amplitude-modulated (SAM) broadband noise in the presence of a SAM broadband background noise with a modulation depth (m) of 0.00, 0.25, or 0.50, where the condition m = 0.00 corresponds to standard (unmasked) modulation detection. The modulation frequency of the masker was 4, 16, or 64 Hz; the modulation frequency of the signal ranged from 2–512 Hz. The greatest amount of modulation masking (masked threshold minus unmasked threshold) typically occurred when the signal frequency was near the masker frequency. The modulation masking patterns (amount of modulation masking versus signal frequency) for the 4-Hz masker were low pass, whereas the patterns for the 16- and 64-Hz maskers were somewhat bandpass (although not strictly so). In general, the greater the modulation depth of the masker, the greater the amount of modulation masking (although this trend was reversed for the 4-Hz masker at high signal frequencies). These modulation-masking data suggest that there are channels in the auditory system that are tuned for the detection of modulation frequency, much as there are channels (critical bands or auditory filters) tuned for the detection of spectral frequency.
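A signal-plus-masker stimulus of the kind described above can be sketched as a broadband noise carrier multiplied by two sinusoidal modulators (an illustrative reconstruction; the function name, parameter defaults, and the multiplicative combination of the signal and masker modulators are assumptions, not the authors' exact synthesis):

```python
import numpy as np

def sam_stimulus(duration=0.5, fs=44100, f_sig=8.0, m_sig=0.2,
                 f_mask=16.0, m_mask=0.5, seed=0):
    """Broadband noise carrying a signal modulation (rate f_sig, depth m_sig)
    superimposed on a masker modulation (rate f_mask, depth m_mask).
    Setting m_mask = 0.0 reduces to standard (unmasked) modulation detection."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(duration * fs)) / fs
    carrier = rng.standard_normal(t.size)               # broadband noise carrier
    masker = 1.0 + m_mask * np.sin(2 * np.pi * f_mask * t)
    signal = 1.0 + m_sig * np.sin(2 * np.pi * f_sig * t)
    return carrier * masker * signal
```

In an adaptive threshold procedure, m_sig would be varied to find the smallest detectable signal modulation at each masker depth (0.00, 0.25, 0.50) and masker rate (4, 16, or 64 Hz).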
The addition of low-frequency acoustic information to real or simulated electric stimulation (so-called electric-acoustic stimulation, or EAS) often results in large improvements in intelligibility, particularly in competing backgrounds. This may reflect the availability of fundamental frequency (F0) information in the acoustic region. The contributions of the F0 and the amplitude envelope (as well as voicing) of speech to simulated EAS were examined by replacing the low-frequency speech with a tone that was modulated in frequency to track the F0 of the speech, in amplitude with the envelope of the low-frequency speech, or both. A four-channel vocoder simulated electric hearing. Significant benefit over the vocoder alone was observed with the addition of a tone carrying F0 or envelope cues, and both cues combined typically provided significantly more benefit than either alone. The intelligibility improvement over the vocoder was between 24 and 57 percentage points, and was unaffected by the presence of a tone carrying these cues from a background talker. These results confirm the importance of the F0 of the target speech for EAS (in simulation). They indicate that significant benefit can be provided by a tone carrying F0 and amplitude envelope cues. The results support a glimpsing account of EAS and argue against a segregation account.
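A noise vocoder of the kind used to simulate electric hearing can be sketched as follows (a minimal illustration, assuming four log-spaced analysis bands, Butterworth filters, and Hilbert-envelope extraction; the band edges and filter order are arbitrary choices, not those of the study):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def noise_vocoder(speech, fs, n_channels=4, lo=300.0, hi=5000.0):
    """Four-channel noise vocoder: filter speech into contiguous bands,
    extract each band's amplitude envelope, and use it to modulate
    noise filtered into the same band; sum the modulated bands."""
    edges = np.geomspace(lo, hi, n_channels + 1)   # log-spaced band edges (assumption)
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(len(speech))
    out = np.zeros(len(speech))
    for f1, f2 in zip(edges[:-1], edges[1:]):
        sos = butter(4, [f1, f2], btype='bandpass', fs=fs, output='sos')
        band = sosfiltfilt(sos, speech)
        env = np.abs(hilbert(band))                # amplitude envelope of the band
        out += sosfiltfilt(sos, noise) * env       # envelope-modulated noise band
    return out
```

In an EAS simulation, the output of such a vocoder would be presented together with either the unprocessed low-frequency speech or, as in the study above, a tone modulated to carry the F0 and/or amplitude-envelope cues of that low-frequency region.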
In this study, auditory stream segregation based on differences in the rate of envelope fluctuations, in the absence of spectral and temporal fine-structure cues, was tested. The temporal sequences to be segregated were composed of fully amplitude-modulated (AM) bursts of broadband noises A and B. All sequences were built by the reiteration of an ABA triplet, where the A modulation rate was fixed at 100 Hz and the B modulation rate was varied. The first experiment was devoted to measuring the threshold difference in AM rate leading subjects to perceive the sequence as two streams as opposed to just one. The results of this first experiment revealed that subjects generally perceived the sequences as a single perceptual stream when the difference in AM rate between the A and B noises was smaller than 0.75 oct, and as two streams when the difference was larger than about 1.00 oct. These streaming thresholds were found to be substantially larger than, and not related to, the subjects' modulation-rate discrimination thresholds. The results of a second experiment demonstrated that AM-rate-based streaming was adversely affected by decreases in AM depth, but that segregation remained possible as long as the AM of either the A or B noises was above the subject's AM-detection threshold. The results of a third experiment indicated that AM-rate-based streaming effects were still observed when the modulations applied to the A and B noises were set individually, either at a constant level in dB above AM-detection threshold or at levels at which they were of the same perceived strength. This finding suggests that AM-rate-based streaming is not necessarily mediated by perceived differences in AM depth. Altogether, the results of this study indicate that sequential sounds can be segregated on the sole basis of differences in the rate of their temporal fluctuations, in the absence of other temporal or spectral cues.
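The AM-rate separations above are expressed in octaves relative to the fixed 100-Hz A rate; a small helper (illustrative only, not from the study) makes the measure concrete:

```python
import math

def am_rate_separation_oct(rate_b_hz, rate_a_hz=100.0):
    """Separation between two AM rates, in octaves (log2 of their ratio)."""
    return abs(math.log2(rate_b_hz / rate_a_hz))
```

For example, a B rate of 200 Hz lies 1.00 oct from the 100-Hz A rate, at the reported two-stream boundary, whereas a B rate of about 168 Hz lies roughly 0.75 oct away, near the upper edge of the one-stream region.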