Differences in voice pitch (F0) and vocal tract length (VTL) improve intelligibility of speech masked by a background talker (speech-on-speech; SoS) for normal-hearing (NH) listeners. Cochlear implant (CI) users, who are less sensitive to these two voice cues compared to NH listeners, experience difficulties in SoS perception. Three research questions were addressed: (1) whether increasing the F0 and VTL difference (DF0; DVTL) between two competing talkers benefits CI users in SoS intelligibility and comprehension, (2) whether this benefit is related to their F0 and VTL sensitivity, and (3) whether their overall SoS intelligibility and comprehension are related to their F0 and VTL sensitivity. Results showed: (1) CI users did not benefit in SoS perception from increasing DF0 and DVTL; increasing DVTL had a slightly detrimental effect on SoS intelligibility and comprehension. Results also showed: (2) the effect from increasing DF0 on SoS intelligibility was correlated with F0 sensitivity, while the effect from increasing DVTL on SoS comprehension was correlated with VTL sensitivity. Finally, (3) the sensitivity to both F0 and VTL, and not only one of them, was found to be correlated with overall SoS performance, elucidating important aspects of voice perception that should be optimized through future coding strategies.
Objectives: Speech intelligibility in the presence of a competing talker (speech-on-speech; SoS) presents more difficulties for cochlear implant (CI) users compared with normal-hearing listeners. A recent study implied that these difficulties may be related to CI users’ low sensitivity to two fundamental voice cues, namely, the fundamental frequency (F0) and the vocal tract length (VTL) of the speaker. Because of the limited spectral resolution in the implant, important spectral cues carrying F0 and VTL information are expected to be distorted. This study aims to address two questions: (1) whether spectral contrast enhancement (SCE), previously shown to enhance CI users’ speech intelligibility in the presence of steady state background noise, could also improve CI users’ SoS intelligibility, and (2) whether such improvements in SoS from SCE processing are due to enhancements in CI users’ sensitivity to F0 and VTL differences between the competing talkers. Design: The effect of SCE on SoS intelligibility and comprehension was measured in two separate tasks in a sample of 14 CI users with Cochlear devices. In the first task, the CI users were asked to repeat the sentence spoken by the target speaker in the presence of a single competing talker. The competing talker was the same target speaker whose F0 and VTL were parametrically manipulated to obtain the different experimental conditions. SoS intelligibility, in terms of the percentage of correctly repeated words from the target sentence, was assessed using the standard advanced combination encoder (ACE) strategy and SCE for each voice condition. In the second task, SoS comprehension accuracy and response times were measured using the same experimental setup as in the first task, but with a different corpus. In the final task, CI users’ sensitivity to F0 and VTL differences were measured for the ACE and SCE strategies. The benefit in F0 and VTL discrimination from SCE processing was evaluated with respect to the improvement in SoS perception from SCE. Results: While SCE demonstrated the potential of improving SoS intelligibility in CI users, this effect appeared to stem from SCE improving the overall signal to noise ratio in SoS rather than improving the sensitivity to the underlying F0 and VTL differences. A second key finding of this study was that, contrary to what has been observed in a previous study for childlike voice manipulations, F0 and VTL manipulations of a reference female speaker (target speaker) toward male-like voices provided a small but significant release from masking for the CI users tested. Conclusions: The present findings, together with those previously reported in the literature, indicate that SCE could serve as a possible background-noise-reduction strategy in commercial CI speech processors that could enhance speech intelligibility especially in the presence of background talkers that have longer VTLs compared with the target speaker.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.