Interspeech 2017
DOI: 10.21437/interspeech.2017-1259

Speech Synthesis for Mixed-Language Navigation Instructions

Abstract: Text-to-Speech (TTS) systems that can read navigation instructions are one of the most widely used speech interfaces today. Text in the navigation domain may contain named entities such as location names that are not in the language that the TTS database is recorded in. Moreover, named entities can be compound words where individual lexical items belong to different languages. These named entities may be transliterated into the script that the TTS system is trained on. This may result in incorrect pronunciatio…

Cited by 10 publications (6 citation statements)
References 7 publications
“…Then, the mapping between the phonemes of both languages is used to synthesize the text using a TTS system trained on a single language. In [10], Chandu et al further extends their code-mixed TTS to a bilingual system using two monolingual speech datasets and a combined phone set for speech synthesis of mixed-language navigation instructions. For deep neural network based speech synthesis, a cross-lingual TTS is built using Kullback-Leibler divergence [11].…”
Section: Related Work (mentioning)
confidence: 99%
“…Synthesis of code mixed text using monolingual data [7,5] has been addressed primarily at the linguistic level: by either mapping the words/phones of the foreign language with the closest sounding phones of the native language or by using transliteration [8,9]. However, these methods have been shown to generate foreign accents [10,11,12].…”
Section: Synthesis Of Code Mixed Text (mentioning)
confidence: 99%
“…In the context of Text to Speech (TTS), voice deployed in such contexts has to be able to synthesize mixed text without ignoring the content from one of the languages. Typical approaches for building such mixed lingual voices require bilingual recordings [3,4,5]: speech data from the speaker in both native language as well as the additional language. However, obtaining such data might not always be feasible.…”
Section: Introduction (mentioning)
confidence: 99%
“…Mixed-lingual speech synthesis systems are trained based on different grapheme to phoneme conversion techniques, acoustic and prosodic modeling [10]. [11] attempts to obtain correct pronunciation of Indic words in navigation instructions. While [3,11] deal with multilingual text in the Romanized script, [7,9] synthesise multilingual text with words in their native script.…”
Section: Introduction (mentioning)
confidence: 99%
“…[11] attempts to obtain correct pronunciation of Indic words in navigation instructions. While [3,11] deal with multilingual text in the Romanized script, [7,9] synthesise multilingual text with words in their native script. In [12], experiments are conducted by code-mixing in the same script and mixed scripts.…”
Section: Introduction (mentioning)
confidence: 99%