2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2014
DOI: 10.1109/icassp.2014.6855086

Multilingual deep neural network based acoustic modeling for rapid language adaptation

Abstract: This paper presents a study on multilingual deep neural network (DNN) based acoustic modeling and its application to new languages. We investigate the effect of phone merging on multilingual DNNs in the context of rapid language adaptation. Moreover, the combination of multilingual DNNs with Kullback-Leibler divergence based acoustic modeling (KL-HMM) is explored. Using ten different languages from the GlobalPhone database, our studies reveal that crosslingual acoustic model transfer through multilingual DNNs is sup…

citations
Cited by 113 publications
(75 citation statements)
references
References 17 publications
“…Multilingual training of deep neural network (DNN)-based ASR systems has provided some improvements in the automatic recognition of both low- and high-resourced languages [13][14][15][16][17][18][19][20][21][22]. Some of these techniques incorporate multilingual DNNs for feature extraction [13,18,23,24].…”
Section: Introduction (mentioning)
confidence: 99%
“…As pointed earlier in Section 1, in phone-based ASR the multilingual ANN can be simply adapted by replacing the last layer with clustered CD phones of TL [14,16]. A similar approach could be employed in our scenario in which the acoustic units are the clustered CD graphemes of TL.…”
Section: Comparison To Related Approaches (mentioning)
confidence: 99%
“…The multilingual acoustic model is then adapted on target language (TL) data based on a deterministic lexical model learned on TL data. The adaptation process can also involve redefinition of acoustic unit space based on TL data [14,15,16]. In the absence of lexical resources in the literature, typically graphemes are used as subword units [17,18,19].…”
Section: Introduction (mentioning)
confidence: 99%
“…Meanwhile, DNN is also trained to model one single universal multilingual senone set. Phones of multiple languages are all explicitly mapped to a universal phone set (e.g., IPA) [6,9]. Thus there is sufficient data to train the universal phones.…”
Section: Introduction (mentioning)
confidence: 99%
“…Therefore, the hidden layers can be trained jointly using data from multiple languages to benefit each other [3,5]. The target of the multilingual DNN can be either the universal International Phonetic Alphabet (IPA) based multilingual senones [6] or a layer consisting of separate activations for each language [3,7,8].…”
Section: Introduction (mentioning)
confidence: 99%
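
The last statement above describes hidden layers trained jointly across languages, topped by a layer with separate activations for each language. A minimal sketch of that layout is given below, assuming sigmoid hidden units and one softmax output layer per language. The class name, the language codes, the feature dimension, and the layer and senone counts are all hypothetical and are not taken from the cited papers; training is omitted, so only the forward pass is shown.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for a dense layer with small random weights."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(layer, x):
    """Affine transform: one weighted sum plus bias per output unit."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def softmax(v):
    m = max(v)  # subtract the maximum for numerical stability
    exps = [math.exp(a - m) for a in v]
    total = sum(exps)
    return [e / total for e in exps]


class MultilingualDNN:
    """Hypothetical layout: shared hidden layers, one output layer per language."""

    def __init__(self, n_features, hidden_sizes, senones_per_language, seed=0):
        rng = random.Random(seed)
        sizes = [n_features] + list(hidden_sizes)
        # Hidden layers are shared by every language.
        self.hidden = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
        # Each language gets its own output layer over its own senone set.
        self.heads = {
            lang: init_layer(sizes[-1], n_out, rng)
            for lang, n_out in senones_per_language.items()
        }

    def posteriors(self, features, language):
        """Forward one feature frame through the shared stack and one language head."""
        h = features
        for layer in self.hidden:
            h = sigmoid(dense(layer, h))
        return softmax(dense(self.heads[language], h))


# Illustrative sizes only: 39-dimensional frames, two languages.
model = MultilingualDNN(
    n_features=39,
    hidden_sizes=[64, 64],
    senones_per_language={"FR": 120, "DE": 150},
)
frame = [0.0] * 39
post_fr = model.posteriors(frame, "FR")
post_de = model.posteriors(frame, "DE")
```

In this layout, the alternative mentioned in the same statement (a single universal IPA-based senone set) would correspond to keeping only one entry in `heads` that is shared by all languages.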