The Speech Transmission Index (STI) is a physical metric that is well correlated with the intelligibility of speech degraded by additive noise and reverberation. The traditional STI uses modulated noise as a probe signal and is valid for assessing degradations that result from linear operations on the speech signal. Researchers have attempted to extend the STI to predict the intelligibility of nonlinearly processed speech by proposing variations that use speech as a probe signal. This work considers four previously proposed speech-based STI methods and four novel methods, studied under conditions of additive noise, reverberation, and two nonlinear operations (envelope thresholding and spectral subtraction). Analyzing intermediate metrics in the STI calculation reveals why some methods fail for nonlinear operations. Results indicate that none of the previously proposed methods is adequate for all of the conditions considered, while four proposed methods produce qualitatively reasonable results and warrant further study. The discussion considers the relevance of this work to predicting the intelligibility of cochlear-implant processed speech.
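As a rough illustration of the kind of intermediate metric the traditional STI calculation relies on, the sketch below converts a modulation transfer value into an apparent signal-to-noise ratio and then into a transmission index. It assumes the conventional ±15 dB clipping range and the stationary additive-noise case; the function names are hypothetical, and it does not reproduce any of the speech-based methods evaluated in the abstract above.

```python
import math


def modulation_index_from_snr(snr_db):
    """Modulation transfer value for stationary additive noise alone
    (assumption: no reverberation, no nonlinear processing)."""
    return 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))


def transmission_index(m, limit_db=15.0):
    """Map a modulation transfer value m (0..1) to a transmission index (0..1).

    The apparent SNR is 10*log10(m / (1 - m)), clipped to +/- limit_db,
    then rescaled linearly onto the range 0..1.
    """
    # Keep m strictly inside (0, 1) so the logarithm stays finite.
    m = min(max(m, 1e-12), 1.0 - 1e-12)
    apparent_snr_db = 10.0 * math.log10(m / (1.0 - m))
    apparent_snr_db = max(-limit_db, min(limit_db, apparent_snr_db))
    return (apparent_snr_db + limit_db) / (2.0 * limit_db)


if __name__ == "__main__":
    for snr_db in (-20.0, 0.0, 15.0, 30.0):
        m = modulation_index_from_snr(snr_db)
        print(snr_db, round(m, 3), round(transmission_index(m), 3))
```

In a full STI calculation these indices would be averaged over modulation frequencies and combined across octave bands with band weights, which this sketch omits.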
The purpose of this study was to determine the extent to which cochlear implant (CI) rate discrimination can be improved through training. Six adult CI users took part in a study that included 32 h of training and assessment on rate discrimination measures. Rate difference limens (DLs) were measured from 110 to 3520 Hz in octave steps using 500 ms biphasic pulse trains; the target and standard stimuli were loudness-balanced with the target always at an adaptively lower rate. DLs were measured at four electrode positions corresponding to basal, mid-basal, mid-apical, and apical locations. Procedural variations were implemented to determine if rate discrimination was impacted by random variations in stimulus amplitude or by amplitude modulation. DLs improved by more than a factor of 2 across subjects, electrodes, and standard rates. Factor analysis indicated that the effect of training was comparable for all electrodes and standard rates tested. Neither level roving nor amplitude modulation had a significant effect on rate DLs. In conclusion, the results demonstrate that training can significantly improve CI rate discrimination on a psychophysical task.
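For reference, the octave-spaced standard rates described in the abstract above can be enumerated with a short sketch. The helper name is hypothetical and the code only lists the rates; it says nothing about the adaptive procedure itself.

```python
def octave_steps(start_hz, stop_hz):
    """Return rates from start_hz up to stop_hz, doubling at each step."""
    rates = []
    rate = start_hz
    while rate <= stop_hz:
        rates.append(rate)
        rate *= 2  # one octave up
    return rates


# Standard rates from 110 to 3520 Hz in octave steps.
standard_rates = octave_steps(110, 3520)

if __name__ == "__main__":
    print(standard_rates)
```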
This study examined correlations between pitch and phoneme perception for nine cochlear implant users and nine normal hearing listeners. Pure tone frequency discrimination thresholds were measured for frequencies of 500, 1000, and 2000 Hz. Complex tone fundamental frequency (F0) discrimination thresholds were measured for F0s of 110, 220, and 440 Hz. The effects of amplitude and frequency roving were measured under the rationale that individuals who are robust to such perturbations would perform better on phoneme perception measures. Phoneme identification was measured using consonant and vowel materials in quiet, in stationary speech-shaped noise (SSN), in spectrally notched SSN, and in temporally gated SSN. Cochlear implant pure tone frequency discrimination thresholds ranged between 1.5 and 9.9 %, while cochlear implant complex tone F0 discrimination thresholds ranged between 2.6 and 28.5 %. On average, cochlear implant users had 5.3 dB of masking release for consonants and 8.4 dB of masking release for vowels when measured in temporally gated SSN compared to stationary SSN. Correlations with phoneme identification measures were generally higher for complex tone discrimination measures than for pure tone discrimination measures. Correlations with phoneme identification measures were also generally higher for pitch perception measures that included amplitude and frequency roving. The strongest correlations were observed for measures of complex tone F0 discrimination with phoneme identification in temporally gated SSN. The results of this study suggest that musical training or signal processing strategies that improve F0 discrimination should improve consonant identification in fluctuating noise.
The purpose of this study is to identify precise and repeatable measures for assessing cochlear-implant (CI) hearing. The study presents psychoacoustic and phoneme identification measures in CI and normal-hearing (NH) listeners and examines correlations between the measures. Psychoacoustic measures included pitch discrimination tasks using pure tones, harmonic complexes, and tone pips; intensity perception tasks included intensity discrimination for tones and modulation detection; spectral-temporal masking tasks included gap detection, forward and backward masking, tone-on-tone masking, synthetic formant-on-formant masking, and tone-in-noise detection. Phoneme perception measures included vowel and consonant identification in quiet, in stationary speech-shaped noise, and in temporally gated speech-shaped noise. Results on psychoacoustic measures illustrate the effects of broader filtering in CI hearing contributing to reduced pitch perception and increased spectral masking. Results on consonant and vowel identification measures illustrate a wide range in performance across CI listeners. They also provide further evidence that CI listeners obtain little to no release from masking in temporally gated noise compared to stationary noise. The forward- and backward-masking measures had the highest correlation with the phoneme identification measures for CI listeners. No significant correlations between speech reception and psychoacoustic measures were observed for NH listeners. The superior NH performance on measures of phoneme identification, especially in the presence of background noise, is a key difference between groups.
Cochlear implant (CI) users find it extremely difficult to discriminate between talkers, which may partially explain why they struggle to understand speech in a multi-talker environment. Recent studies, based on findings with postlingually deafened CI users, suggest that these difficulties may stem from their limited use of vocal-tract length (VTL) cues due to the degraded spectral resolution transmitted by the CI device. The aim of the present study was to assess the ability of adult CI users who had no prior acoustic experience, i.e., prelingually deafened adults, to discriminate between resynthesized "talkers" based on either fundamental frequency (F0) cues, VTL cues, or both. Performance was compared to that of individuals with normal hearing (NH), listening either to degraded stimuli, using a noise-excited channel vocoder, or to non-degraded stimuli. Results show that (a) age at implantation was associated with the use of VTL but not F0 cues in discriminating between talkers, with improved discrimination for those subjects who were implanted at an earlier age; (b) there was a positive relationship for the CI users between VTL discrimination and speech recognition score in quiet and in noise, but not with frequency discrimination or cognitive abilities; (c) early-implanted CI users showed voice discrimination ability similar to that of the NH adults who listened to vocoded stimuli. These data support the notion that voice discrimination is limited by the speech processing of the CI device. However, they also suggest that early implantation may facilitate sensory-driven tonotopicity and/or improve higher-order auditory functions, enabling better perception of VTL spectral cues for voice discrimination.