2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) 2021
DOI: 10.1109/asru51503.2021.9688101

On Addressing Practical Challenges for RNN-Transducer

Abstract: In this paper, several methods are proposed to address practical challenges in deploying RNN Transducer (RNN-T) based speech recognition systems. These challenges are adapting a well-trained RNN-T model to a new domain without collecting audio data, and obtaining time stamps and confidence scores at the word level. The first challenge is solved with a data splicing method that concatenates speech segments extracted from the source domain data. To get the time stamp, a phone prediction branch is added to the RNN…

Cited by 13 publications (1 citation statement)
References 37 publications
“…To effectively adapt large-scale pre-trained English ASR models to different language recognition (e.g., English to French), previous research efforts [30] suggest a solution by replacing the last prediction layer of ASR. However, we investigate that deploying "multilingual graphemes" is even more effective than replacing the final prediction head directly.…”
Section: English Graphemes Pre-training For Multilingual Data
Type: mentioning
confidence: 99%