2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2021
DOI: 10.1109/asru51503.2021.9688028
Kaizen: Continuously Improving Teacher Using Exponential Moving Average for Semi-Supervised Speech Recognition

Cited by 14 publications (9 citation statements). References 28 publications.
“…transcription generated by some method. Different ways of inferring pseudo-labels PL(x; θ) have been proposed [22,31,38,26,29,18,7], including both greedy and beam-search decoding, with or without an external LM, and with variants on the "teacher" AM model θ. IPL [38] and slimIPL [26] are continuous PL approaches, where a single AM (with parameters θ) is continuously trained.…”
Section: Acoustic (AM) and Language (LM) Models
confidence: 99%
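The "continuously improving teacher" of the paper this page indexes is maintained as an exponential moving average (EMA) of the student's weights, per the paper's title. A minimal sketch of that update rule, with plain Python lists standing in for parameter tensors (the function name and `decay` default are illustrative, not taken from the paper):

```python
def ema_update(teacher, student, decay=0.999):
    # teacher <- decay * teacher + (1 - decay) * student, elementwise
    return [decay * t + (1 - decay) * s for t, s in zip(teacher, student)]
```

With `decay` close to 1, the teacher changes slowly and averages over many recent student checkpoints, which is what makes its pseudo-labels more stable than the student's own outputs.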
“…The two dominant methods for leveraging unlabeled audio are unsupervised pre-training via self-supervision (SSL) [6,19,11,4] and semi-supervised self-training [22,38,26,29,16,18], or pseudo-labeling (PL). In pre-training, a model is trained to process the raw unlabeled data to extract features that solve some pretext task, followed by supervised fine-tuning on some downstream ASR task.…”
Section: Introduction
confidence: 99%
“…This can be seen as an alternative caching mechanism to [42] for exploiting older models. A similar approach to MPL was proposed in [59], which focused on lower-resource settings and conducted experiments on a hybrid ASR system in addition to a CTC-based end-to-end system. This paper thoroughly investigates MPL on its robustness against variations in domain mismatch severity and over-fitting to LM knowledge.…”
Section: B. Pseudo-Labeling With Multiple Iterations
confidence: 99%
“…We adopt continuous PL (shown in Fig. 2c) [23,24] to compute the L_ASR loss in both stage 1 and stage 2. Note that the continuous PL approach could also be used in other FL approaches like FedNorm and FedExtract.…”
Section: Unsupervised Training With Continuous PL
confidence: 99%
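The continuous-PL loop these snippets cite interleaves pseudo-label generation and training in a single pass: the teacher transcribes an unlabeled batch and the student immediately takes a gradient step on those transcriptions. A hedged toy sketch of one such step, where a single scalar parameter and a squared-error objective stand in for the acoustic model and its ASR loss (all names and the loss choice are illustrative):

```python
def decode(theta, x):
    # stand-in for transcription PL(x; theta); a real AM would run
    # greedy or beam-search decoding here
    return theta * x

def pl_step(theta_student, theta_teacher, batch, lr=0.1):
    # pseudo-label the unlabeled batch with the (e.g. EMA) teacher
    pseudo = [decode(theta_teacher, x) for x in batch]
    # one squared-error gradient step for the student on (x, pseudo-label) pairs
    grad = sum(2 * (decode(theta_student, x) - y) * x
               for x, y in zip(batch, pseudo)) / len(batch)
    return theta_student - lr * grad
```

Keeping the teacher distinct from the student (rather than having the model label its own batch) is what prevents the degenerate case of training on exactly its own current outputs.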
“…Then, part of the training burden is moved to the server, reducing computation on clients. Additionally, DecoupleFL adopts pseudo-labeling (PL) approaches [23,24] for unsupervised learning, avoiding the unrealistic assumption of labeled data. Moreover, one potential concern is that communicating features might lead to privacy leakage.…”
Section: Introduction
confidence: 99%