2016
DOI: 10.1016/j.procs.2016.04.049

Building Statistical Parametric Multi-speaker Synthesis for Bangladeshi Bangla

Cited by 7 publications (6 citation statements)
References 22 publications
“…The corpus has both male and female speakers and is quite mixed: Most datasets are single-speaker, while others, like Bangladeshi Bengali and Icelandic, have recordings from multiple speakers [9]. The recording conditions vary: some speakers were recorded in anechoic chambers, while others in a regular recording studio setup or on university campuses.…”
Section: Methods
Citation type: mentioning
confidence: 99%
“…However, this approach may not apply for under-resourced languages, when no source corpora are available. In this scenario, crowd-sourcing the data from multiple speakers and building an average voice is possible [9]. For the majority of the world's languages, in the long tail of the distribution [10], even these approaches may not be feasible, due to the lack of sufficient audio data, linguistic resources, or adequate infrastructure [11].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…There have been numerous attempts at solving this data bottleneck. Text-to-speech voices have been successfully trained on crowdsourced speech data [2], on filtered acoustic data for training speech recognition systems [3], or commercial audiobooks [4]. New voices have also been created using voice adaptation techniques [5] or voice morphing [6].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Recent work has been successful at developing text-to-speech (TTS) resources and systems for under-resourced languages by combining inexpensive methods such as crowd-sourcing and bootstrapping with powerful statistical acoustic modelling techniques [1]. The South African (SA) context with 11 official languages, most of them under-resourced but some closely related, provides an interesting scenario for further investigating rapid development of practical systems with similar efficient techniques.…”
Section: Introduction
Citation type: mentioning
confidence: 99%