2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2012
DOI: 10.1109/icassp.2012.6288918

Creating synthetic voices for children by adapting adult average voice using stacked transformations and VTLN

Abstract: This paper describes experiments in creating personalised children's voices for HMM-based synthesis by adapting either an adult or child average voice. The adult average voice is trained from a large adult speech database, whereas the child average voice is trained using a small database of children's speech. Here we present the idea to use stacked transformations for creating synthetic child voices, where the child average voice is first created from the adult average voice through speaker adaptation using al…

Cited by 5 publications (10 citation statements)
References 9 publications
“…However, it was refreshing to discover studies focusing on other variants of English such as Irish English [41] and Indian English [19,42]. Although less common, researchers also considered other languages when experimenting with child-speech synthesis, including Norwegian [5], Spanish [18], Punjabi [43], Finnish [44], German [45,46], Czech and Slovak [47], Mandarin [27,31] and quite often, Italian [21,29,30,48–52].…”
Section: Language (mentioning)
confidence: 99%
“…Interestingly, aside from creating a child voice by adapting an average adult voice or an average child voice, Karhila et al.'s study [44] compared two additional adaptation methods using stacked transformations: StA and StVA. In the first method, StA, an average voice trained from adult data was adapted using training data of the average child voice.…”
Section: Speech-synthesis Systems (mentioning)
confidence: 99%
“…Another technique is adaptive voice conversion, which can be used to dub children's voices in children's movie applications. Techniques for creating or generating children's voices have been proposed in several studies [12], [13], [14], [15]. Watts et al. [14], [15] proposed the Hidden Markov Model (HMM) as a basis for a method of synthesizing children's voices.…”
Section: Introduction (mentioning)
confidence: 99%