6th Workshop on Spoken Language Technologies for Under-Resourced Languages (SLTU 2018) 2018
DOI: 10.21437/sltu.2018-14
A Step-by-Step Process for Building TTS Voices Using Open Source Data and Frameworks for Bangla, Javanese, Khmer, Nepali, Sinhala, and Sundanese

Cited by 26 publications (24 citation statements); References: 4 publications.
“…Our work utilizes publicly available datasets: LJSpeech [50], an English speech corpus with a total duration of about 24 hours; TITML-IDN [51], an Indonesian (ID) speech corpus with an average of 43 minutes for each speaker; OpenSLR jv-ID [52], a Javanese (JV) speech corpus with an average of 10 minutes for each speaker; OpenSLR su-ID [52], a Sundanese (SU) speech corpus with an average of 7 minutes for each speaker. T2 model for Indonesian, Javanese, and Sundanese uses a subset of corpus consisting of one female speaker for each language as shown in Table 1.…”
Section: A. Dataset (citation type: mentioning; confidence: 99%)
“…In order to gather data for new languages, we use a questionnaire asking language consultants to describe all the ways written tokens in various domains can be verbalized (see also [10,11]). We then need to convert this information to a machine-readable format so that it can be used in verbalizer grammars.…”
Section: Verbalization Templates (citation type: mentioning; confidence: 99%)
“…We then need to convert this information to a machine-readable format so that it can be used in verbalizer grammars. Initially [10,11], this was performed by populating a Thrax [14] grammar template. However, we have moved this system to Pynini [15], a Python library which inherits the functionality of Thrax and can use Python's extensive libraries and testing frameworks.…”
Section: Verbalization Templates (citation type: mentioning; confidence: 99%)
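The workflow quoted above (collect verbalizations of written tokens via questionnaire, then convert them to a machine-readable form for verbalizer grammars) can be illustrated with a minimal, dependency-free sketch. This is plain Python standing in for the Pynini/Thrax finite-state grammars the citing work describes; the token table and example strings are invented for illustration, not taken from the paper:

```python
import re

# Toy "verbalization template": written tokens mapped to spoken forms,
# standing in for the machine-readable questionnaire output described
# above. A real system would compile such a table into an FST grammar.
TEMPLATE = {
    "2": "two",
    "km": "kilometers",
    "%": "percent",
}

def verbalize(text: str) -> str:
    """Replace each known written token with its spoken form,
    leaving unknown tokens unchanged."""
    tokens = re.findall(r"\w+|\S", text)
    return " ".join(TEMPLATE.get(tok, tok) for tok in tokens)
```

A lookup table like this captures the questionnaire answers; moving to Pynini, as the excerpt notes, lets the same mappings be composed with context-dependent rules while reusing Python's testing frameworks.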