ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp40776.2020.9054450
Using Automatic Speech Recognition and Speech Synthesis to Improve the Intelligibility of Cochlear Implant Users in Reverberant Listening Environments

Cited by 5 publications (3 citation statements). References 19 publications.
“…Automatic speech recognition (ASR) has a long history of research (Bahl et al., 1983; Hinton et al., 2012; Chu et al., 2020). By audio signal processing and modeling, speech contents can be transcribed into texts for various applications (Yu and Deng, 2016; Yang et al., 2021).…”
Section: Introduction (mentioning)
confidence: 99%
“…Automatic speech recognition (ASR) systems have experienced substantial improvements in recognizing speech in reverberant environments [7]. We [8] and others [9] implemented a strategy in CIs that leverages ASR to translate reverberant speech into an estimated text sequence and uses speech synthesis to generate anechoic speech from the predicted text. This ASR speech synthesis strategy substantially improved reverberant speech intelligibility in CI users [9] and in normal hearing listeners using vocoded speech [8].…”
Section: Introduction (mentioning)
confidence: 99%
“…We [8] and others [9] implemented a strategy in CIs that leverages ASR to translate reverberant speech into an estimated text sequence and uses speech synthesis to generate anechoic speech from the predicted text. This ASR speech synthesis strategy substantially improved reverberant speech intelligibility in CI users [9] and in normal hearing listeners using vocoded speech [8]. However, the ASR speech synthesis strategy is not real-time feasible in a CI processor because it imposes a processing delay that exceeds the maximum audio-visual delay that CI users can tolerate, which is about 260 ms [10].…”
Section: Introduction (mentioning)
confidence: 99%
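The latency constraint in the statement above can be sketched as a simple budget check: an ASR stage followed by a synthesis stage must complete within the roughly 260 ms audio-visual delay that CI users tolerate [10]. The two stages below are hypothetical placeholders (their sleep times are illustrative assumptions, not measurements from the cited works); the sketch only shows how a pipeline's end-to-end delay would be compared against the tolerance.

```python
import time

# Maximum audio-visual delay CI users can tolerate, per [10] (ms).
MAX_DELAY_MS = 260.0

def asr_transcribe(audio: bytes) -> str:
    """Hypothetical ASR stage: reverberant audio -> estimated text.

    The sleep stands in for model inference time (assumed value).
    """
    time.sleep(0.15)
    return "estimated text sequence"

def synthesize_speech(text: str) -> bytes:
    """Hypothetical synthesis stage: predicted text -> anechoic speech."""
    time.sleep(0.20)  # stand-in for synthesis time (assumed value)
    return b"anechoic audio"

def pipeline_latency_ms(audio: bytes) -> float:
    """Measure end-to-end delay of the ASR -> synthesis pipeline."""
    start = time.perf_counter()
    synthesize_speech(asr_transcribe(audio))
    return (time.perf_counter() - start) * 1000.0

latency = pipeline_latency_ms(b"reverberant audio frame")
print(f"pipeline latency: {latency:.0f} ms; "
      f"within CI tolerance: {latency <= MAX_DELAY_MS}")
```

With the assumed stage times (150 ms + 200 ms), the total exceeds the 260 ms budget, illustrating why the quoted statement calls the strategy not real-time feasible in a CI processor.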