5th International Conference on Spoken Language Processing (ICSLP 1998) 1998
DOI: 10.21437/icslp.1998-575

Natural-sounding speech synthesis using variable-length units

Abstract: The goal of this work was to develop a speech synthesis system which concatenates variable-length units to create natural-sounding speech. Our initial work in this area showed that by careful design of system responses to ensure consistent intonation contours, natural-sounding speech synthesis was achievable with word- and phrase-level concatenation. In order to extend the flexibility of this framework, subsequent work focused on the problem of generating novel words from a pre-recorded corpus of sub-word units. …

citations
Cited by 34 publications
(5 citation statements)
references
References 22 publications
“…The past research revealed that more natural-sounding speech is obtained if prosodic information is included in unit selection [7]. However, signal processing techniques employed to do prosodic modifications were reported to reduce the quality of synthesized speech [1,8,29].…”
Section: Source Of Prosody In Concatenative TTS (citation type: mentioning)
confidence: 99%
“…Past research has shown that natural-sounding synthetic speech can be produced by selecting non-uniform units (i.e. units of variable length) from large speech databases [1,2,6,29,17,20]. Studies [2,6,29,25] indicate that the naturalness of synthetic speech can be improved by excising longer sequences of recorded speech from the database, this reduces the number of concatenation points in a synthesized utterance.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…The original synthesis corpus [6] was converted by lowering the fundamental frequency by 30% and the spectrum (formants) by 25%. Thus, the spectral envelope was interpolated by a factor of …”
Section: Male Female Conversion (citation type: mentioning)
confidence: 99%
“…We focus on Cantonese, a major Chinese dialect predominant in Hong Kong, South China and many overseas Chinese communities. The corpus-based concatenation technique has been gaining popularity in speech synthesis [2][3][4][5][6] due to its ability to achieve a high degree of naturalness. The use of corpus-based syllable concatenation is particularly suitable for Chinese, since the language is monosyllabic in nature.…”
Section: Introduction (citation type: mentioning)
confidence: 99%