2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
DOI: 10.1109/asru46091.2019.9003775
Personalization of End-to-End Speech Recognition on Mobile Devices for Named Entities

Abstract: We study the effectiveness of several techniques to personalize end-to-end speech models and improve the recognition of proper names relevant to the user. These techniques differ in the amounts of user effort required to provide supervision, and are evaluated on how they impact speech recognition performance. We propose using keyword-dependent precision and recall metrics to measure vocabulary acquisition performance. We evaluate the algorithms on a dataset that we designed to contain names of persons that are…
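The keyword-dependent precision and recall mentioned in the abstract can be illustrated with a minimal sketch: precision counts how many keyword tokens the recognizer emitted that actually occur in the reference, and recall counts how many reference keyword occurrences the recognizer recovered. This is an assumed reading of the metric, not the paper's exact definition; `keyword_precision_recall` and its arguments are hypothetical names.

```python
from collections import Counter

def keyword_precision_recall(refs, hyps, keywords):
    """Keyword-dependent precision/recall over parallel transcripts (sketch).

    refs, hyps: lists of reference and hypothesis transcripts (strings).
    keywords: set of keyword strings (e.g. personalized names).
    """
    tp = fp = fn = 0
    for ref, hyp in zip(refs, hyps):
        # Count keyword occurrences on each side of one utterance pair.
        rc = Counter(w for w in ref.split() if w in keywords)
        hc = Counter(w for w in hyp.split() if w in keywords)
        for w in keywords:
            tp += min(rc[w], hc[w])         # keyword emitted and correct
            fp += max(hc[w] - rc[w], 0)     # keyword emitted spuriously
            fn += max(rc[w] - hc[w], 0)     # keyword missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

For example, if the recognizer gets "alice" right but misrecognizes "bob" as "rob", precision over the emitted keywords is 1.0 while recall over the reference keywords is 0.5.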

Cited by 45 publications (28 citation statements) · References 24 publications (30 reference statements)
“…Khodak et al [270] and Jiang et al [250] explore the connection between FL and MAML, and show how the MAML setting is a relevant framework to model the personalization objectives for FL. Chai Sim et al [102] applied local fine tuning to personalize speech recognition models in federated learning. Fallah et al [181] developed a new algorithm called Personalized FedAvg by connecting MAML instead of Reptile to federated learning.…”
Section: Local Fine Tuning and Meta-learning
confidence: 99%
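The local fine-tuning idea in the statement above — start from a globally (e.g. federatedly) trained model and take a few gradient steps on one user's data — can be sketched on a toy 1-D linear model. This is an illustrative assumption, not any of the cited algorithms; `local_fine_tune` and its parameters are hypothetical.

```python
def local_fine_tune(w_global, user_data, lr=0.1, steps=5):
    """Personalize a global model with a few local SGD steps (sketch).

    w_global: (weight, bias) of a 1-D linear model y = w*x + b.
    user_data: list of (x, y) pairs from one user's device.
    Returns a personalized copy; the global parameters are untouched.
    """
    w, b = w_global
    for _ in range(steps):
        for x, y in user_data:
            err = (w * x + b) - y   # gradient of 0.5*(pred - y)**2
            w -= lr * err * x
            b -= lr * err
    return w, b
```

The key design point is that personalization happens on-device after global training, so each user's model drifts toward their own data distribution without changing the shared model.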
“…The models were trained using the efficient implementation [13] in TensorFlow [14]. We measured the success of the modified model using the word error rate (WER) metric as well as the name recall rate [4] as described below:…”
Section: Results
confidence: 99%
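The word error rate (WER) referenced in the statement above is the word-level edit distance between reference and hypothesis, divided by the reference length. A minimal sketch (standard dynamic-programming WER, not the cited implementation; `wer` is a hypothetical helper):

```python
def wer(ref, hyp):
    """Word error rate: word-level edit distance / reference length."""
    r, h = ref.split(), hyp.split()
    # d[i][j] = edit distance between r[:i] and h[:j]
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i                     # deletions
    for j in range(len(h) + 1):
        d[0][j] = j                     # insertions
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(r)][len(h)] / len(r)
```

For instance, one substituted word in a three-word reference gives a WER of 1/3.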
“…Because the size of E2E models is much smaller than that of hybrid models, E2E models have clear advantages when being deployed to device. Therefore, personalization or adaptation of E2E models [119], [120], [126], [127] is a rapidly growing area. While it is possible to adapt every user's model in the cloud and then push it back to the device, it is more reasonable to adapt the model on device, which requires adjusting the adaptation algorithm to overcome the challenge of limited memory and computation power [119].…”
Section: Summary and Discussion
confidence: 99%
“…Because AED and RNN-T also have components corresponding to the language model, there are also techniques specific to adapting the language modeling aspect of E2E models, for instance using a text embedding instead of an acoustic embedding to bias an E2E model in order to produce outputs relevant to the particular recognition context [123]-[125]. If the new domain differs from the source domain mainly in content instead of acoustics, domain adaptation on E2E models can be performed by either interpolating the E2E model with an external language model or updating language-model-related components inside the E2E model with the text-to-speech audio generated from the text in the new domain [126], [127], discussed in Sec. XII.…”
Section: Adaptation Algorithms for NN-based ASR
confidence: 99%