Interspeech 2009
DOI: 10.21437/interspeech.2009-20

Cross-language bootstrapping for unsupervised acoustic model training: rapid development of a Polish speech recognition system

Abstract: This paper describes the rapid development of a Polish-language speech recognition system. The system development was performed without access to any transcribed acoustic training data. This was achieved through the combined use of cross-language bootstrapping and confidence-based unsupervised acoustic model training. A Spanish acoustic model was ported to Polish through the use of a manually constructed phoneme mapping. This initial model was refined through iterative recognition and retraining of the untran…
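The porting step the abstract describes (a manually constructed phoneme mapping carries the Spanish acoustic model over to Polish) can be sketched roughly as below. This is a minimal illustration only: the identifiers (`PL_TO_ES`, `bootstrap_polish_models`) and the mapping entries are hypothetical, since the paper does not publish its mapping.

```python
import copy

# Hypothetical sketch of cross-language bootstrapping: seed each Polish
# phoneme's acoustic model from a manually mapped Spanish one. The
# mapping entries are illustrative guesses, not the paper's actual table.
PL_TO_ES = {
    "a": "a", "e": "e", "i": "i", "o": "o", "u": "u",  # shared vowels
    "ń": "ñ",    # Polish /ɲ/ has a close Spanish counterpart
    "cz": "ch",  # Polish /tʂ/ approximated by Spanish /tʃ/
    "sz": "s",   # Polish /ʂ/ falls back to Spanish /s/
}

def bootstrap_polish_models(spanish_models: dict) -> dict:
    """Initialise Polish phoneme models by copying the mapped Spanish models."""
    return {pl: copy.deepcopy(spanish_models[es]) for pl, es in PL_TO_ES.items()}
```

Phonemes with no close Spanish counterpart need a fallback choice, which is presumably why the mapping was constructed by hand rather than derived automatically.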

Citations: cited by 34 publications (9 citation statements, spanning 2011–2022)
References: 12 publications
Citation statements, ordered by relevance:
“…Past research [5,6,7] addressed this problem, finding that existing resources for other languages can be leveraged to pretrain, or bootstrap, an acoustic model, and then adapt it to the target language, given a small quantity of adaptation data.…”
Section: Introduction (confidence: 99%)
“…Unsupervised training is commonly associated in the literature with transcribing speech in a language A, using an ASR system trained in a language B, as in Ragni et al. (2014); Lööf et al. (2009); Qian et al. (2013). However, this need not be the case.…”
Section: Unsupervised Training (confidence: 99%)
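The confidence-based unsupervised training this quote refers to amounts to a decode-filter-retrain loop over the untranscribed audio. A minimal sketch, assuming generic `decode` and `train` callables standing in for a real ASR toolkit, with an arbitrary threshold and iteration count:

```python
# Hypothetical decode-filter-retrain loop for confidence-based
# unsupervised acoustic model training. None of these names come from
# the paper; `decode` and `train` stand in for a real ASR toolkit.
def unsupervised_training(model, untranscribed_audio, decode, train,
                          threshold=0.7, iterations=3):
    """Iteratively transcribe audio with the current model, keep only
    confidently recognised utterances, and retrain on them."""
    for _ in range(iterations):
        selected = []
        for utterance in untranscribed_audio:
            hypothesis, confidence = decode(model, utterance)
            if confidence >= threshold:          # confidence filter
                selected.append((utterance, hypothesis))
        model = train(selected)                  # retrain on auto-labels
    return model
```

The threshold trades training-set size against transcript quality: set too low, recognition errors are fed back into the model; set too high, too little data survives to improve it.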
“…As an alternative to such training of an ASR system from IL speech, we opted for a transfer learning paradigm and started with models trained on one or more higher-resource language(s). Other previous approaches [5,6,7,8] have explored cross-language ASR transfer assuming shared phonemic representations, generally using the GlobalPhone corpus [9], while [10] examines multilingual training of deep neural networks. Unlike these approaches, which had on the order of hours of target-language speech, we are dealing with only minutes of adaptation speech.…”
[Displaced figure caption: Figure 1: Using the English SF-Type classifier to obtain adaptation/training data]
Section: A Small Amount of IL-English Parallel Text (confidence: 99%)