2020
DOI: 10.1609/aaai.v34i05.6341

Towards Zero-Shot Learning for Automatic Phonemic Transcription

Abstract: Automatic phonemic transcription tools are useful for low-resource language documentation. However, due to the lack of training sets, only a tiny fraction of languages have phonemic transcription tools. Fortunately, multilingual acoustic modeling provides a solution given limited audio training data. A more challenging problem is to build phonemic transcribers for languages with zero training data. The difficulty of this task is that phoneme inventories often differ between the training languages and the targe…

Cited by 24 publications (17 citation statements)
References 18 publications
“…More broadly there has been extensive work on domain adaptation and transfer learning in machine learning, reviewed by Kouw and Loog [46]. This includes work on few-shot learning [47]-[49] and normalizing flows [50], [51]. Normalizing flows which provide a probabilistic framework for feature transformations, were first developed for speech recognition as Gaussianization [52], and more recently have been applied to speech synthesis [53] and voice transformation [54].…”
Section: Adaptation and Transfer Learning In Related Fields
Type: mentioning
confidence: 99%
“…One critical issue with most multilingual recognition models is that their phone coverage is hardly complete [19]. For example, our trained model could cover around 200 phones, whereas the Phoible inventory has around 2000 distinct phones.…”
Section: Phone Recognition
Type: mentioning
confidence: 99%
“…Zero-shot transfer learning addresses this by training a single multilingual model on the labeled data of several languages to enable zero-shot transcription of unseen languages [16,17,18,19,17,20]. Models usually have a common encoder that extracts acoustic information from speech audio and then predict either a shared phoneme vocabulary [17,16] or language-specific phonemes [1,20,21]. The former requires either phonological units that are agnostic to any particular language such as articulatory features [20] or global phones [22,17].…”
Section: Introduction
Type: mentioning
confidence: 99%