Interspeech 2018
DOI: 10.21437/interspeech.2018-1990
Fast Language Adaptation Using Phonological Information

Abstract: Phoneme-based multilingual connectionist temporal classification (CTC) model is easily extensible to a new language by concatenating parameters of the new phonemes to the output layer. In the present paper, we improve cross-lingual adaptation in the context of phoneme-based CTC models by using phonological information. A universal (IPA) phoneme classifier is first trained on phonological features generated from a phonological attribute detector. When adapting the multilingual CTC to a new, never seen, language…
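The extension step described in the abstract, concatenating parameters for the new phonemes onto the trained output layer, can be sketched as follows. This is a minimal NumPy illustration; the dimensions, initialization scale, and handling of the CTC blank are illustrative assumptions, not the paper's actual configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

hidden_dim = 8   # hypothetical encoder output size
n_seen = 5       # phonemes covered by the multilingual model (CTC blank omitted)
n_new = 2        # unseen phonemes of the target language

# Existing output-layer parameters: one weight row and one bias per phoneme.
W = rng.normal(scale=0.1, size=(n_seen, hidden_dim))
b = np.zeros(n_seen)

# Cross-lingual extension: concatenate freshly initialized rows for the new
# phonemes; all previously trained parameters are kept unchanged.
W_new = rng.normal(scale=0.1, size=(n_new, hidden_dim))
W_ext = np.concatenate([W, W_new], axis=0)
b_ext = np.concatenate([b, np.zeros(n_new)])

# The extended layer now scores all seen + new phonemes for each frame.
frame = rng.normal(size=hidden_dim)
logits = W_ext @ frame + b_ext
```

Only the appended rows need to be trained from scratch during adaptation; the shared encoder and the rows of the already-seen phonemes carry over directly.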

Cited by 7 publications (13 citation statements). References 19 publications (24 reference statements).
“…[17] applied articulatory features (AFs) to Deep Bottleneck (DBN) feature based i-vector and x-vector systems, achieving better performance than the baseline. Because AFs are language-independent features, much research has focused on multilingual speech recognition [18][19][20]. Previous studies generally take a bottom-up approach and train phonological feature detectors, whereas here we jointly train on the phonological labels and acoustic frame reconstruction.…”
Section: Related Work
confidence: 99%
“…In addition, we developed a new approach to initialize the model parameters by incorporating phonological information. It was demonstrated that the proposed approach results in better and faster convergence in cross-lingual adaptation [Tong et al, 2018b].…”
Section: Main Contributions
confidence: 99%
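The statement above refers to initializing the parameters of unseen phonemes from phonological information rather than at random. One way such an initialization could look is to copy the trained output row of the phonologically closest seen phoneme. This is a hedged sketch, not the authors' exact procedure; the attribute vectors, phoneme set, and nearest-neighbor rule are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
hidden_dim = 8

# Hypothetical binary phonological attribute vectors
# (e.g. voiced, nasal, labial, coronal).
attrs = {
    "p": np.array([0, 0, 1, 0]),
    "b": np.array([1, 0, 1, 0]),
    "m": np.array([1, 1, 1, 0]),
    "n": np.array([1, 1, 0, 1]),
}

# Trained output-layer rows for phonemes already seen by the model.
seen_rows = {ph: rng.normal(size=hidden_dim) for ph in ["p", "b", "m"]}

def init_new_row(new_ph):
    """Initialize an unseen phoneme's output row from its phonologically
    closest seen phoneme (Hamming distance over attribute vectors)."""
    nearest = min(seen_rows, key=lambda s: int(np.sum(attrs[s] != attrs[new_ph])))
    return seen_rows[nearest].copy(), nearest

# /n/ shares voicing and nasality with /m/, so /m/'s row is the closest match.
row, src = init_new_row("n")
```

Starting the new rows near phonologically similar, already-trained phonemes gives the adaptation a sensible starting point, which is consistent with the faster convergence reported above.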
“…Moreover, a universal phoneme-based model is easily extensible to unseen phonemes when adapted to a new language [12]. With this motivation, and following our previous work [12,13], we propose a multilingual architecture that uses a universal output label set consisting of the union of all phonemes from the multiple languages. This universal phone set can be either derived in a data-driven way, or obtained from the International Phonetic Alphabet (IPA).…”
Section: Universal Phone Set
confidence: 99%
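The union-based universal output label set described above can be illustrated with a toy example. The per-language inventories below are illustrative fragments, not the actual inventories used in the cited work:

```python
# Hypothetical per-language IPA phoneme inventories.
inventories = {
    "en": {"p", "b", "t", "d", "ɪ", "æ"},
    "de": {"p", "b", "t", "d", "ʏ", "ç"},
    "fr": {"p", "b", "t", "d", "ɛ̃", "ʁ"},
}

# Universal output label set: the union of all language inventories.
# Shared IPA symbols (here /p b t d/) appear only once, so their output
# parameters are shared across languages.
universal = sorted(set().union(*inventories.values()))
```

Deriving the set from IPA symbols (rather than data-driven clustering) is what makes the cross-lingual extension in the abstract possible: any new language's phonemes either already exist in the universal set or are appended to it.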
“…Multilingual ASR and cross-lingual adaptation can benefit more from these properties: language-specific prerequisite systems are no longer required; cross-lingual adaptation from an IPA-based system can be done simply by extending the output layer to new phonemes in a target language [12]. CTC training has been shown to be a promising alternative to the traditional DNN-HMM system for both multilingual ASR and cross-lingual adaptation [12,13].…”
Section: Introduction
confidence: 99%