Interspeech 2019
DOI: 10.21437/interspeech.2019-1954

Transformer Based Grapheme-to-Phoneme Conversion

Cited by 44 publications (33 citation statements); References 17 publications
“…Novak et al (2016) employ a joint multigram approach to generate weighted finite-state transducers for G2P. Recently, neural sequence-to-sequence models based on CNN and RNN architectures have been proposed for the G2P task delivering superior results compared to earlier non-neural approaches (Chae et al, 2018;Yolchuyeva et al, 2019a). Similar to our approach, Yolchuyeva et al (2019b) use transformers (Vaswani et al, 2017) to perform English G2P conversion.…”
Section: Related Work
confidence: 97%
“…Recurrent neural networks in a variety of models have been applied to the g2p problem, including LSTMs and bidirectional LSTMs (Rao et al, 2015), as well as convolutional networks (Yolchuyeva et al, 2019). The Transformer for g2p is investigated in and Yolchuyeva et al (2020), showing improvements over previous models, at least in high-resource settings. Low-resource settings for g2p in general are examined in Jyothi and Hasegawa-Johnson (2017), and a number of papers have experimented with high-resource to low-resource transfer learning (Schlippe et al, 2014; Deri and Knight, 2016), an avenue we did not explore in this work.…”
Section: Related Work
confidence: 99%
“…Many different approaches to G2P exist in the literature, including rule-based systems, LSTMs (Rao et al, 2015), joint-sequence models (Galescu and Allen, 2002), and encoder-decoder architectures based on convolutional neural networks (Yolchuyeva et al, 2019), LSTMs (Yao and Zweig, 2015), or transformers (Yolchuyeva et al, 2020; Sun et al, 2019). In this paper, we improve over previous work by exploring two straightforward extensions of a standard transformer (Vaswani et al, 2017) model for the task: multi-task training (Caruana, 1997) and ensembling.…”
Section: Related Work
confidence: 99%
“…We explore using a transformer model (Vaswani et al, 2017) for this problem, since it has shown great promise in several areas of natural language processing (NLP), outperforming the previous state of the art on a large variety of tasks, including machine translation (Vaswani et al, 2017), summarization (Raffel et al, 2019), question-answering (Raffel et al, 2019), and sentiment-analysis (Munikar et al, 2019). While previous work has used transformers for G2P, experiments have only been performed on English, specifically on the CMUDict (Weide, 2005) and NetTalk datasets (Yolchuyeva et al, 2020; Sun et al, 2019). Our approach builds upon the standard architecture by adding two straightforward modifications: multi-task training (Caruana, 1997) and ensembling.…”
Section: Introduction
confidence: 99%
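
The citing statements above all frame grapheme-to-phoneme conversion as character-level sequence-to-sequence transduction with a Transformer. The following is a minimal sketch of that framing, not the cited authors' code: the toy grapheme/phoneme vocabularies, the "cat" → K AE1 T (ARPAbet) example, and all hyperparameters are illustrative assumptions, and positional encodings as well as the multi-task and ensembling extensions discussed above are omitted for brevity.

# Minimal sketch (assumed setup, not the cited papers' implementation):
# G2P as character-level seq2seq transduction with a standard Transformer.
import torch
import torch.nn as nn

GRAPHEMES = ["<pad>", "<s>", "</s>"] + list("abcdefghijklmnopqrstuvwxyz")
PHONEMES = ["<pad>", "<s>", "</s>", "K", "AE1", "T"]  # toy ARPAbet inventory (assumption)
g2i = {g: i for i, g in enumerate(GRAPHEMES)}
p2i = {p: i for i, p in enumerate(PHONEMES)}

class G2PTransformer(nn.Module):
    def __init__(self, d_model=128, nhead=4, layers=3):
        super().__init__()
        self.src_emb = nn.Embedding(len(GRAPHEMES), d_model)
        self.tgt_emb = nn.Embedding(len(PHONEMES), d_model)
        # Positional encodings omitted for brevity; a real model would add them.
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=nhead,
            num_encoder_layers=layers, num_decoder_layers=layers,
            dim_feedforward=4 * d_model, batch_first=True)
        self.out = nn.Linear(d_model, len(PHONEMES))

    def forward(self, src, tgt):
        # Causal mask: each predicted phoneme attends only to earlier phonemes.
        tgt_mask = self.transformer.generate_square_subsequent_mask(tgt.size(1))
        h = self.transformer(self.src_emb(src), self.tgt_emb(tgt), tgt_mask=tgt_mask)
        return self.out(h)

# One teacher-forced training step on a single toy entry: "cat" -> K AE1 T.
src = torch.tensor([[g2i[c] for c in "cat"]])
tgt = torch.tensor([[p2i[p] for p in ["<s>", "K", "AE1", "T"]]])
gold = torch.tensor([[p2i[p] for p in ["K", "AE1", "T", "</s>"]]])

model = G2PTransformer()
logits = model(src, tgt)                                    # shape: (1, 4, |PHONEMES|)
loss = nn.functional.cross_entropy(logits.transpose(1, 2), gold)
loss.backward()                                             # standard cross-entropy training

In this framing, the ensembling mentioned above would amount to averaging the output distributions of several independently trained models, and multi-task training to sharing the encoder across related prediction tasks; both are left out of the sketch.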