Objective: The primary goal of nonlinear frequency compression (NFC) and other frequency-lowering strategies is to increase the audibility of high-frequency sounds that are otherwise inaudible with conventional hearing-aid processing because of the degree of hearing loss, limited hearing-aid bandwidth, or a combination of both. The aim of the current study was to compare estimates of the audibility of speech processed with NFC to improvements in speech recognition for a group of children and adults with high-frequency hearing loss.

Design: Monosyllabic word recognition was measured in noise for twenty-four adults and twelve children with mild to severe sensorineural hearing loss. Stimuli were amplified based on each listener's audiogram, either with conventional processing (CP) using amplitude compression or with NFC, and were presented under headphones using a software-based hearing-aid simulator. A modification of the speech intelligibility index (SII) was used to estimate the audibility of information in the frequency-lowered bands. The mean improvement in SII was compared with the mean improvement in speech recognition.

Results: All but two listeners showed improved speech recognition with NFC compared with CP, consistent with the small increase in audibility estimated by the modified SII. Children and adults showed similar improvements in speech recognition with NFC.

Conclusion: Word recognition with NFC was higher than with CP for children and adults with mild to severe hearing loss. The average improvement in speech recognition with NFC (7%) was consistent with the modified SII, which indicated that listeners experienced an increase in audibility with NFC relative to CP. Further studies are needed to determine whether changes in audibility with NFC are related to speech recognition with NFC for listeners with greater degrees of hearing loss, with a wider variety of compression settings, and with auditory training.
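The SII logic referenced above can be illustrated in miniature. The sketch below is a generic, simplified SII-style calculation (importance-weighted band audibility), not the study's specific modification for frequency-lowered bands; all band levels, thresholds, and importance weights are invented placeholders, not study data.

```python
# Simplified SII-style audibility estimate (in the spirit of ANSI S3.5-1997;
# not a certified implementation). All numbers below are illustrative.

def band_audibility(speech_dB, noise_dB, threshold_dB):
    """Fraction of the 30-dB speech dynamic range audible in one band."""
    effective_floor = max(noise_dB, threshold_dB)  # masking noise or hearing loss
    a = (speech_dB + 15.0 - effective_floor) / 30.0
    return min(1.0, max(0.0, a))

def sii(speech, noise, thresholds, importance):
    """Importance-weighted sum of band audibilities (0 = inaudible, 1 = fully audible)."""
    return sum(w * band_audibility(s, n, t)
               for s, n, t, w in zip(speech, noise, thresholds, importance))

# Toy 4-band example: a sloping loss renders the two high-frequency bands inaudible.
speech     = [55, 50, 45, 40]   # band speech levels, dB
noise      = [30, 30, 30, 30]
thresholds = [20, 30, 60, 80]   # hearing thresholds with conventional processing
importance = [0.2, 0.3, 0.3, 0.2]

print(round(sii(speech, noise, thresholds, importance), 3))  # -> 0.5
```

Frequency lowering moves high-frequency speech information into lower bands where the toy thresholds are better, which is how an NFC condition could raise the weighted sum.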
Informational masking (IM) refers to elevations in signal threshold caused by masker uncertainty. The purpose of this study was to investigate two factors expected to influence IM in hearing-impaired listeners. Masked thresholds for a 2000-Hz signal in the presence of simultaneous multitone maskers were measured in 16 normal-hearing (NH) and 9 hearing-impaired (HI) listeners. The maskers had an average total power of 70 dB SPL and were composed of fixed-frequency components between 522 and 8346 Hz, separated from one another by at least 1/3 oct and from the signal by at least 2/3 oct. Masker uncertainty was manipulated by randomly presenting each masker component with probability p = 0.1, 0.2, ..., 0.9, or 1.0 across different trial blocks. Energetic masking was estimated as the amount of masking for p = 1.0, where masker uncertainty was minimal. IM was estimated as the amount of masking in excess of energetic masking. Decision weights were estimated by regressing each listener's yes/no responses against the presence or absence of the signal and masker components. The decision weights and sensation levels (SLs) of the stimulus components were incorporated as factors in a model that predicts individual differences in IM from the level variance (in dB) at the output of independent auditory filters [Lutfi, J. Acoust. Soc. Am. 94, 748-758 (1993)]. The results showed large individual variability in IM for the NH listeners (over 40 dB) but little IM for most HI listeners. When masker components were presented to a group of NH listeners at SLs similar to those of the HI listeners, IM was also similar to that of the HI listeners. IM was likewise similar for the two groups when the level per masker component was 10 dB SL. These results suggest that reduced masker SLs for HI listeners decrease IM by effectively reducing masker variance.
Weighting efficiencies, computed by comparing each listener's pattern of weights to that of an ideal analytic listener, were a good predictor of individual differences in IM among the NH listeners. For the HI listeners, weighting efficiency and IM were unrelated because of the large variation in masker SLs among individual listeners, the small variance in IM, and perhaps because broadened auditory filters in some listeners increased the covariance of the auditory filter outputs.
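The decision-weight analysis described in this abstract can be sketched as an ordinary regression of trial-by-trial yes/no responses on the presence or absence of each stimulus component. The simulated "listener" below is a stand-in with invented parameters, not data from the study, and least squares is used here for simplicity (logistic regression is a common alternative).

```python
import numpy as np

# Hedged sketch: estimate decision weights by regressing a listener's yes/no
# responses on the presence (1) or absence (0) of the signal and each masker
# component across trials. All listener parameters here are simulated.

rng = np.random.default_rng(0)
n_trials, n_components = 2000, 10

# Column 0 = signal present/absent; columns 1-9 = masker components (p = 0.5 each).
X = rng.integers(0, 2, size=(n_trials, n_components)).astype(float)

# Simulated listener: strong weight on the signal, some leakage onto component 3
# (an ideal analytic listener would weight only the signal).
true_w = np.zeros(n_components)
true_w[0], true_w[3] = 1.0, 0.4
decision_var = X @ true_w + rng.normal(0.0, 0.3, n_trials)
responses = (decision_var > 0.7).astype(float)  # "yes" when above criterion

# Least-squares decision weights; the intercept absorbs overall response bias.
coefs, *_ = np.linalg.lstsq(np.column_stack([np.ones(n_trials), X]),
                            responses, rcond=None)
weights = coefs[1:]  # drop the intercept

print(int(np.argmax(np.abs(weights))))  # -> 0: the signal column dominates
```

A weighting efficiency in the sense described above could then be computed by comparing `weights` against the ideal pattern (all weight on column 0).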
Brief experience with reliable spectral characteristics of a listening context can markedly alter perception of subsequent speech sounds, and parallels have been drawn between auditory compensation for listening context and visual color constancy. To better evaluate such an analogy, the generality of acoustic context effects was investigated for sounds with spectral-temporal compositions distinct from speech. Listeners identified nonspeech sounds (extensively edited samples produced by a French horn and a tenor saxophone) following either resynthesized speech or a short passage of music. Preceding contexts were "colored" by spectral envelope difference filters, which were created to emphasize differences between French horn and saxophone spectra. Listeners were more likely to report hearing a saxophone when the stimulus followed a context filtered to emphasize spectral characteristics of the French horn, and vice versa. Despite clear changes in apparent acoustic source, the auditory system calibrated to relatively predictable spectral characteristics of the filtered context, differentially affecting perception of subsequent nonspeech target sounds. This calibration to listening context, and relative indifference to acoustic sources, operates much like visual color constancy, in which reliable properties of the spectrum of illumination are factored out of the perception of color.
Objectives: The purpose of this study was to investigate the joint effects of wide dynamic range compression (WDRC) release time (RT) and number of channels on the recognition of sentences in the presence of steady and modulated maskers at different signal-to-noise ratios (SNRs). The study also examined how different combinations of WDRC parameters affect output SNR and the role this plays in the observed findings.

Design: Twenty-four listeners with mild to moderate sensorineural hearing loss identified sentences mixed with steady or modulated maskers at three SNRs (−5, 0, +5 dB) that had been processed using a hearing-aid simulator with six combinations of RT (40 and 640 ms) and number of channels (4, 8, and 16). Compression parameters were set using the Desired Sensation Level v5.0a prescriptive fitting method. For each condition, amplified speech and masker levels and the resulting long-term output SNR were measured.

Results: Speech recognition with WDRC depended on the combination of RT and number of channels, with the largest effects observed at 0 dB input SNR, where mean speech recognition scores varied by 10–12% across WDRC manipulations. Overall, effect sizes were generally small. Across both masker types and the three SNRs tested, the best speech recognition was obtained with 8 channels, regardless of RT. Increased speech levels, which favor audibility, were associated with the short RT and with an increase in the number of channels. These same conditions also increased masker levels by an even greater amount, yielding a net decrease in long-term output SNR. Changes in long-term SNR across WDRC conditions were strongly associated with changes in temporal envelope shape as quantified by the Envelope Difference Index; however, neither factor fully explained the observed differences in speech recognition.
Conclusions: A primary finding of this study was that the number of channels had a modest effect at each level of RT, with results suggesting that 8 channels is the safest choice for a given RT. Effects were smaller for RT: short RT was slightly better when only 4 channels were used, and long RT was better when 16 channels were used. Individual differences in how listeners were influenced by audibility, output SNR, temporal distortion, and spectral distortion may have contributed to the size of the effects found in this study. Because only general suppositions could be made about how each of these factors influenced the overall results, future research would benefit from exploring the predictive value of these and other factors in selecting the processing parameters that maximize speech recognition for individuals.
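The Envelope Difference Index (EDI) mentioned in the results quantifies how much compression reshapes a signal's temporal envelope. The sketch below follows one common formulation (mean-normalize each envelope, then average the absolute sample-wise difference, giving 0 for identical envelopes and 1 for maximally different ones); the envelopes here are synthetic toy signals, not stimuli from the study.

```python
import numpy as np

# Hedged sketch of an Envelope Difference Index computation. WDRC with short
# release times tends to flatten envelope modulation, which raises the EDI
# relative to the unprocessed envelope.

def edi(env_a, env_b):
    """EDI between two amplitude envelopes of equal length (0 = identical)."""
    a = np.asarray(env_a, float)
    b = np.asarray(env_b, float)
    a = a / a.mean()          # normalize so overall level differences drop out
    b = b / b.mean()
    return np.abs(a - b).sum() / (2.0 * len(a))

t = np.linspace(0.0, 1.0, 1000, endpoint=False)
original   = 1.0 + 0.8 * np.sin(2 * np.pi * 4 * t)   # deeply modulated envelope
compressed = 1.0 + 0.3 * np.sin(2 * np.pi * 4 * t)   # compression reduces modulation depth

print(edi(original, original))    # identical envelopes give 0
print(edi(original, compressed))  # reduced modulation depth raises the EDI
```

With mean-normalized envelopes the denominator equals the summed envelopes, so this matches the usual ratio-of-sums definition while keeping the code short.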