2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp.2016.7472632
Bottleneck linear transformation network adaptation for speaker adaptive training-based hybrid DNN-HMM speech recognizer

Cited by 3 publications (3 citation statements) · References 21 publications
“…However, a large network is obviously undesirable from the viewpoints of computational load and memory size; it is also unfavorable from the viewpoint of controlling training robustness to unseen data. To meet this requirement for finding a small, necessary, and sufficient DNN structure, several approaches have reshaped the network structure [3,4,5] or pruned the network nodes [6]. However, these methods assumed retraining or adapting a size-reduced network for high discriminative power.…”
Section: Introduction
confidence: 99%
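The excerpt above mentions pruning network nodes as one route to a smaller DNN. A minimal, hypothetical sketch of magnitude-based node pruning is given below; the threshold, layer sizes, and scoring rule are illustrative assumptions, not the method of any cited paper.

```python
import numpy as np

# Illustrative sketch: prune hidden nodes of one layer by the L2 norm
# of their incoming weights (an assumed scoring rule for demonstration).
rng = np.random.default_rng(1)
W = rng.standard_normal((8, 16))  # 8 hidden nodes, 16 inputs

# Score each node; keep the top half.
node_scores = np.linalg.norm(W, axis=1)
keep = np.argsort(node_scores)[-4:]
W_pruned = W[keep]

# As the excerpt notes, a pruned network would then typically be
# retrained or adapted to recover discriminative power.
print(W_pruned.shape)  # (4, 16)
```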
“…In model-based speaker adaptive training this may be done by splitting the weights of the acoustic model into a speaker-independent and a speaker-dependent set. During training, a copy of the speaker-dependent weights is maintained and optimised for each speaker separately [12,13,14]. Here, we take an alternative approach: Instead of maintaining and optimising a separate copy of speaker-dependent weights for each speaker we embed speaker adaptation directly into the acoustic model training using a meta-learning approach in order to find a good initialisation for speaker-dependent weights.…”
Section: Speaker Adaptive Training As a Meta-learning Task
confidence: 99%
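The weight splitting described in this excerpt can be sketched as follows. This is a hypothetical toy illustration, assuming a single shared speaker-independent layer and one speaker-dependent linear transform per speaker, initialised to identity; all names and dimensions are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
feat_dim, hidden_dim = 4, 3

# Speaker-independent weights: shared across all speakers.
W_si = rng.standard_normal((hidden_dim, feat_dim))

# Speaker-dependent weights: a separate copy per speaker, maintained
# and optimised individually during training (here just initialised
# to identity transforms).
speakers = ["spk1", "spk2"]
W_sd = {s: np.eye(feat_dim) for s in speakers}

def forward(x, speaker):
    """Apply the speaker-dependent transform, then the shared layer."""
    return np.tanh(W_si @ (W_sd[speaker] @ x))

x = rng.standard_normal(feat_dim)
y1 = forward(x, "spk1")
y2 = forward(x, "spk2")
# Before any speaker-specific optimisation, the identity SD transforms
# leave all speakers with the same canonical model output.
print(np.allclose(y1, y2))  # True
```

During training, only `W_sd[speaker]` would be updated on that speaker's data, factoring speaker variation out of the shared `W_si`.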
“…In model-based speaker adaptive training, the acoustic model is parameterised as speakerdependent and speaker-independent weights. A copy of the speaker-dependent weights is maintained and optimised separately for each speaker during the training process in order to factor out speaker variation from the canonical speakerindependent acoustic model [12,13,14]. Finally, all hybrid approaches can be considered as speaker adaptive training because they provide information about speaker identity, which allows the acoustic model to easily remove speaker variation from the input features [9,10,15].…”
Section: Introduction
confidence: 99%