2011
DOI: 10.1016/j.specom.2011.01.004

Efficient and reliable perceptual weight tuning for unit-selection text-to-speech synthesis based on active interactive genetic algorithms: A proof-of-concept

Abstract: Unit-selection speech synthesis is one of the current corpus-based text-to-speech synthesis techniques. The quality of the generated speech depends on the accuracy of the unit selection process, which in turn relies on the cost function definition. This function should map the user perceptual preferences when selecting synthesis units, which is still an open research issue. This paper proposes a complete methodology for the tuning of the cost function weights by fusing the human judgments with the cost function…

Citations: cited by 12 publications (2 citation statements)
References: 28 publications (65 reference statements)
“…This block retrieves the units that minimise the prosodic, linguistic, and concatenation costs (see [42] for more details). The weights for the prosodic target and concatenation subcosts were perceptually tuned by means of active interactive genetic algorithms for speech synthesis purposes [44].…”
Section: Text-to-speech Subsystem (mentioning)
confidence: 99%
“…Therefore, this method involves less signal processing or no signal processing. Unit selection method is popular attributable to its high intelligibility and naturalness of output speech (Alias et al, 2011). However, demands larger database for better quality (Barra-Chicote et al, 2010).…”
Section: Comparison Of State-of-the-art Speech Synthesis System (mentioning)
confidence: 99%