Findings of the Association for Computational Linguistics: ACL 2022
DOI: 10.18653/v1/2022.findings-acl.166
Zero-shot Learning for Grapheme to Phoneme Conversion with Language Ensemble

Abstract: Grapheme-to-Phoneme (G2P) conversion has many applications in NLP and speech. Most existing work focuses on languages with abundant training data, which limits the scope of target languages to fewer than 100. This work applies zero-shot learning to approximate G2P models for all low-resource and endangered languages in Glottolog (about 8k languages). For any unseen target language, we first build the phylogenetic tree (i.e. language family tree) to identify the top-k nearest languages for…

Cited by 6 publications (10 citation statements). References 17 publications.
“…Other parameters follow the original literature [24]. For the pronunciation model, we use the multilingual model proposed in the previous literature and its implementation [30]. For the language model, we first download the complete dataset from Crúbadán's website [33], which results in 1909 languages after cleaning.…”
Section: Methods
confidence: 99%
“…However, the majority of languages do not have any accessible dictionaries or rules, so we consider an approximated pronunciation model δ_pm instead. In particular, we apply a recently proposed multilingual G2P model as our pronunciation model [30]. For any target language l_target, this G2P model selects the top-k nearest languages l_topk ∈ KNN(l_target) whose training sets are available. During inference, it first proposes k hypotheses, one from each nearest-language model δ_{l_topk}; the models are ensembled by combining the hypotheses into a lattice to emit the most likely approximated sequence:…”
Section: Pronunciation Model
confidence: 99%
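The ensemble described in this excerpt can be sketched roughly as follows. The positional majority vote is a deliberate simplification of the paper's lattice combination, and `nearest_models` is a hypothetical stand-in for the per-language G2P models; this is an illustrative sketch, not the authors' implementation.

```python
from collections import Counter

def ensemble_g2p(word, nearest_models, k=3):
    """Approximate G2P for an unseen language: query the G2P models of the
    top-k nearest languages and merge their hypotheses.

    `nearest_models` is a hypothetical list of callables, each mapping a
    grapheme string to a phoneme list. A real system combines hypotheses
    in a lattice; here we take a positional majority vote as a
    simplification (assumes comparable-length hypotheses).
    """
    hypotheses = [model(word) for model in nearest_models[:k]]
    length = min(len(h) for h in hypotheses)
    merged = []
    for i in range(length):
        votes = Counter(h[i] for h in hypotheses)
        merged.append(votes.most_common(1)[0][0])
    return merged

# Toy callables standing in for nearest-language G2P models: two of the
# three neighbours map grapheme "c" to /k/, one maps it to /s/.
models = [
    lambda w: list(w.replace("c", "k")),
    lambda w: list(w.replace("c", "s")),
    lambda w: list(w.replace("c", "k")),
]
print(ensemble_g2p("cat", models))  # majority vote picks 'k' -> ['k', 'a', 't']
```

The vote illustrates why more than one nearest language helps: an idiosyncratic mapping in a single neighbour is outvoted by the others.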
“…In this study, we select similar languages for cross-lingual transfer based on the language families defined in Glottolog [43]. Language selection based on language family has been used in previous studies [44], [4] for spoken language processing tasks. Specifically, as mentioned in § IV-A, we choose French and Spanish from the Romance languages for adaptation to Italian, while we use Malayalam and Telugu from the Dravidian languages for adaptation to Tamil.…”
Section: Text-based Adaptation
confidence: 99%
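Family-based language selection of the kind described above can be sketched by ranking candidates on the length of the family path they share with the target. The paths below are illustrative approximations, not actual Glottolog classifications:

```python
def nearest_by_family(target_path, candidates, k=2):
    """Pick the k candidate languages whose family path shares the longest
    prefix with the target's path (a crude proxy for phylogenetic distance).
    """
    def shared_prefix(a, b):
        n = 0
        for x, y in zip(a, b):
            if x != y:
                break
            n += 1
        return n

    ranked = sorted(candidates.items(),
                    key=lambda kv: shared_prefix(target_path, kv[1]),
                    reverse=True)
    return [lang for lang, _ in ranked[:k]]

# Illustrative family paths (root -> subgroup), not real Glottolog data.
families = {
    "French":    ["Indo-European", "Italic", "Romance"],
    "Spanish":   ["Indo-European", "Italic", "Romance"],
    "German":    ["Indo-European", "Germanic"],
    "Malayalam": ["Dravidian", "Southern"],
    "Telugu":    ["Dravidian", "South-Central"],
}
italian = ["Indo-European", "Italic", "Romance"]
print(nearest_by_family(italian, families))  # -> ['French', 'Spanish']
```

With an Italian target this picks French and Spanish, matching the selection reported in the excerpt.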
“…Recently, deep-learning architectures for g2p have been developed based on recurrent neural networks [33], convolutional neural networks [34], and transformers [35]. Zero-shot g2p learning techniques that require no explicit training data have been proposed, but they rest on the assumption that similar language families use the same orthography, which is not always true [36]. Phonetisaurus [37] is a data-driven tool that learns the mapping rules statistically (joint-sequence models) from a training dataset and builds weighted FSTs for g2p conversion.…”
Section: Related Work
confidence: 99%
“…For languages with regular grapheme-to-phoneme conversion patterns, knowledge-based g2p has been reported to produce good results [36], [38]. A set of sequential rewrite rules can be used to achieve this.…”
Section: Related Work
confidence: 99%
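The sequential rewrite-rule approach mentioned above can be sketched as an ordered list of grapheme-to-phoneme substitutions. The rules here are a toy illustration, not any real language's ruleset; production systems also guard against later rules re-matching already-emitted phonemes, which this sketch ignores:

```python
def rule_based_g2p(word, rules):
    """Apply an ordered list of rewrite rules (grapheme pattern -> phoneme
    string). Order matters: longer, more specific patterns must come first,
    or a digraph like "ch" would be split by the single-letter rules.
    """
    for pattern, phonemes in rules:
        word = word.replace(pattern, phonemes)
    return word

# Toy rules: digraphs before single letters.
rules = [("ch", "tʃ"), ("sh", "ʃ"), ("a", "æ")]
print(rule_based_g2p("chash", rules))  # -> "tʃæʃ"
```

Reordering the list so that ("a", "æ") ran first would still work here, but putting a single-letter rule such as ("c", "k") before ("ch", "tʃ") would break the digraph, which is why knowledge-based systems order their rules carefully.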