Interspeech 2016
DOI: 10.21437/interspeech.2016-1230
On the Use of Gaussian Mixture Model Framework to Improve Speaker Adaptation of Deep Neural Network Acoustic Models

Abstract: In this paper we investigate GMM-derived (GMMD) features for adaptation of deep neural network (DNN) acoustic models. The adaptation of a DNN trained on GMMD features is done through maximum a posteriori (MAP) adaptation of the auxiliary GMM model used for GMMD feature extraction. We explore fusion of the adapted GMMD features with conventional features, such as bottleneck and MFCC features, in two different neural network architectures: DNN and time-delay neural network (TDNN). We analyze and compar…
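The GMMD features described in the abstract can be sketched as the per-component log-likelihoods of each acoustic frame under an auxiliary GMM, concatenated with the conventional features. The following is a minimal illustrative sketch, not the paper's implementation: function names, the diagonal-covariance assumption, and the toy dimensions are all assumptions.

```python
import numpy as np

def log_gauss_diag(x, mean, var):
    """Log-density of frame x under one diagonal-covariance Gaussian."""
    return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

def gmmd_features(frame, means, variances, weights):
    """GMM-derived feature vector: weighted log-likelihood of the frame
    under each component of the auxiliary GMM."""
    return np.array([
        np.log(w) + log_gauss_diag(frame, m, v)
        for m, v, w in zip(means, variances, weights)
    ])

# Toy example: one 13-dim MFCC frame scored against a 4-component GMM.
rng = np.random.default_rng(0)
frame = rng.standard_normal(13)
means = rng.standard_normal((4, 13))
variances = np.ones((4, 13))
weights = np.full(4, 0.25)

feats = gmmd_features(frame, means, variances, weights)
fused = np.concatenate([frame, feats])  # fusion with conventional features
```

In the papers' setup the auxiliary GMM is speaker-adapted with MAP, so `means`/`variances` would be the adapted parameters rather than speaker-independent ones.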

Cited by 10 publications (8 citation statements). References 77 publications (124 reference statements).
“…The use of GMM-derived (GMMD) features has been shown to provide an efficient technique for neural network AM adaptation across different adaptation tasks, such as speaker adaptation [27,30,40] and environment or noise adaptation [31,32]. In this section, we describe the standard (Section 3.1) and improved (Section 3.2) SAT procedures for neural network AMs using GMMD features.…”
Section: Improved SAT Using GMMD Features
confidence: 99%
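The MAP adaptation step that these citing papers build on updates each auxiliary-GMM mean toward the speaker's adaptation data, interpolated with the prior mean by a relevance factor. A minimal sketch of the standard MAP mean update, with illustrative names and toy values (not taken from the paper):

```python
import numpy as np

def map_adapt_means(means, frames, responsibilities, tau=10.0):
    """MAP re-estimation of GMM means from adaptation data.

    means:            (K, D) prior (speaker-independent) component means
    frames:           (T, D) adaptation frames
    responsibilities: (T, K) posterior of component k given frame t
    tau:              relevance factor controlling the prior's weight
    """
    occ = responsibilities.sum(axis=0)           # gamma_k: soft occupancy counts
    weighted = responsibilities.T @ frames       # sum_t gamma_tk * x_t, shape (K, D)
    return (tau * means + weighted) / (tau + occ)[:, None]

# Toy adaptation: 2 components, 3-dim features, 5 frames all assigned
# to component 0; component 1 sees no data and keeps its prior mean.
means = np.zeros((2, 3))
frames = np.ones((5, 3))
resp = np.tile([1.0, 0.0], (5, 1))
adapted = map_adapt_means(means, frames, resp, tau=5.0)
```

Components with little adaptation data stay close to the speaker-independent prior, which is what makes MAP attractive for the small per-speaker data regime these papers target.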
“…Characteristics of the obtained data sets are given in Table 1. A more detailed description of the data can be found in [40]. For evaluation, a 4-gram language model (LM) with a 152K-word vocabulary was used.…”
Section: Data Sets
confidence: 99%
“…This idea led to the proposed method of adaptation based on GMM-derived features, described in detail in the papers [24][25][26]. The scheme of the method's application in its original variant (for speaker adaptation 2 ) is depicted in Fig.…”
Section: Adaptation Based On GMM-derived Features
confidence: 99%
“…As a result, the main DNN-HMM acoustic model remains unchanged. A number of experiments described in [24][25][26] show that although the SI-DNN-HMM on GMM-derived features performs worse than on MFCCs, its adaptation is extremely efficient. An illustration of the latter statement (applied to speaker adaptation) is shown in Fig.…”
Section: Adaptation Based On GMM-derived Features
confidence: 99%
“…They are concatenated with conventional acoustic features and used for DNN training and decoding. The benefit of GMM-derived features has recently been shown in [26]-[28] in the context of speaker adaptation of DNN-based acoustic models. The authors in [29] also used GMM log-likelihoods as input features (without conventional acoustic features) for adaptation to stationary noise.…”
Section: Introduction
confidence: 99%