Interspeech 2018
DOI: 10.21437/interspeech.2018-2246

Domain Adaptation Using Factorized Hidden Layer for Robust Automatic Speech Recognition

Abstract: Domain robustness is a challenging problem for automatic speech recognition (ASR). In this paper, we consider speech data collected for different applications as separate domains and investigate the robustness of acoustic models trained on multi-domain data on unseen domains. Specifically, we use Factorized Hidden Layer (FHL) as a compact low-rank representation to adapt a multi-domain ASR system to unseen domains. Experimental results on two unseen domains show that FHL is a more effective adaptation method co…

Cited by 32 publications (19 citation statements)
References 32 publications
“…Overfitting can be suppressed by using a lower learning rate and early stopping. This is consistent with the results reported in [12], where robust domain adaptation can be achieved by fine-tuning the entire model with millions of parameters on a small amount of adaptation data.…”
Section: Overfitting; citation type: supporting
confidence: 92%
“…There are a number of methods for adapting large ASR models to small amounts of data [12,13,14]. This paper's ap-…”
Section: Related Work; citation type: mentioning
confidence: 99%
“…The highest performance of speaker identification achieved 97.65% using GMFCC with 14 features employing YOHO database. Meanwhile, the accuracy using [30], reported an investigation into the use of Factorized Hidden Layer (FHL) to achieve compact model adaptation to unseen domains. The authors found that the SVD to initialize the low-rank bases of an FHL model leads to a faster convergence and improved performance.…”
Section: Accuracy =; citation type: mentioning
confidence: 99%
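
The last statement above mentions using the SVD to initialize the low-rank bases of an FHL model. A minimal sketch of that idea follows, under stated assumptions: this page does not give the paper's parameterization, so the adapted-weight form used here (base weight plus a low-rank term scaled by a small per-domain vector) is assumed for illustration, and the names `svd_init_low_rank`, `adapted_weight`, and `domain_vec` are hypothetical rather than taken from the paper.

```python
import numpy as np


def svd_init_low_rank(weight, rank):
    """Split the best rank-`rank` approximation of `weight` into two bases.

    Returns (left, right) such that left @ right.T is the truncated SVD
    of `weight`. The singular values are shared evenly between the bases.
    """
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    root = np.sqrt(s[:rank])
    left = u[:, :rank] * root        # shape (out_dim, rank)
    right = vt[:rank, :].T * root    # shape (in_dim, rank)
    return left, right


def adapted_weight(base, left, right, domain_vec):
    """Assumed FHL-style form: base + left @ diag(domain_vec) @ right.T."""
    return base + (left * domain_vec) @ right.T


# Toy layer: 6 outputs, 4 inputs, rank-2 bases.
rng = np.random.default_rng(0)
base = rng.standard_normal((6, 4))
left, right = svd_init_low_rank(base, rank=2)

# A zero domain vector leaves the base weights unchanged.
unadapted = adapted_weight(base, left, right, np.zeros(2))

# Only `domain_vec` (2 values here) would be estimated per domain.
adapted = adapted_weight(base, left, right, np.array([0.1, -0.05]))
```

In this sketch the per-domain parameters are just the entries of `domain_vec`, which is what makes the representation compact relative to fine-tuning the full weight matrix.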