Jon Rong-Wei Yi scite author profile

The goal of this work was to develop a speech synthesis system that concatenates variable-length units to create natural-sounding speech. Our initial work in this area showed that, by carefully designing system responses to ensure consistent intonation contours, natural-sounding speech synthesis was achievable with word- and phrase-level concatenation. To extend the flexibility of this framework, subsequent work focused on the problem of generating novel words from a pre-recorded corpus of sub-word units. The design of the sub-word units was motivated by perceptual experiments that investigated where speech could be spliced with minimal distortion and which contextual constraints had to be maintained to produce natural-sounding speech. This sub-word corpus is then searched at synthesis time with a Viterbi search, which selects a sequence of units based on how well they individually match the input specification and on how well they sound as an ensemble. This concatenative speech synthesis system, ENVOICE, has been used in a conversational information retrieval system in two application domains to convert meaning representations into speech waveforms.
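The Viterbi unit search described in the abstract can be illustrated with a minimal sketch. This is not the ENVOICE implementation: the function name `select_units`, the two cost functions, and the toy corpus are all hypothetical, chosen only to show how a per-unit match cost and a between-unit join cost combine in a dynamic-programming search.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Target = TypeVar("Target")
Unit = TypeVar("Unit")


def select_units(
    targets: Sequence[Target],
    candidates: Sequence[Sequence[Unit]],
    target_cost: Callable[[Target, Unit], float],
    join_cost: Callable[[Unit, Unit], float],
) -> Tuple[List[Unit], float]:
    """Viterbi search: return the cheapest unit sequence and its total cost."""
    if not targets or len(targets) != len(candidates):
        raise ValueError("need exactly one candidate list per target")
    if any(len(c) == 0 for c in candidates):
        raise ValueError("every target needs at least one candidate unit")

    # best[j]: cheapest cost of any path ending in candidates[i][j]
    best = [target_cost(targets[0], u) for u in candidates[0]]
    back: List[List[int]] = []  # back[i-1][j]: best predecessor index for position i

    for i in range(1, len(targets)):
        new_best, pointers = [], []
        for u in candidates[i]:
            costs = [best[k] + join_cost(p, u)
                     for k, p in enumerate(candidates[i - 1])]
            k_min = min(range(len(costs)), key=costs.__getitem__)
            new_best.append(costs[k_min] + target_cost(targets[i], u))
            pointers.append(k_min)
        best = new_best
        back.append(pointers)

    # Trace back from the cheapest final unit.
    j = min(range(len(best)), key=best.__getitem__)
    total = best[j]
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [candidates[i][j] for i, j in enumerate(path)], total


# Toy corpus (made up): each unit is (label, source utterance id, position).
def toy_target_cost(label: str, unit: Tuple[str, int, int]) -> float:
    return 0.0 if unit[0] == label else 1.0  # how well the unit matches the spec


def toy_join_cost(prev: Tuple[str, int, int], cur: Tuple[str, int, int]) -> float:
    # Units that were adjacent in the same recording join with no penalty.
    return 0.0 if prev[1] == cur[1] and prev[2] + 1 == cur[2] else 1.0


targets = ["a", "b", "c"]
candidates = [
    [("a", 1, 0), ("a", 2, 0)],
    [("b", 2, 1), ("b", 3, 0)],
    [("c", 2, 2), ("c", 1, 5)],
]
best_units, total = select_units(targets, candidates, toy_target_cost, toy_join_cost)
print(best_units, total)
```

In this toy setup the join cost favors units that were contiguous in the original recording, so the search prefers longer stretches from a single utterance, which is one way variable-length units can emerge from a search over smaller ones.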

Information-theoretic criteria for unit selection synthesis

Yi, Glass

2002

Concatenative speech synthesis using a finite-state transducer

Yi

2007

J. Acoust. Soc. Am.

Jon Rong-Wei Yi

A flexible, scalable finite-state transducer architecture for corpus-based concatenative speech synthesis

MUXING: a telephone-access Mandarin conversational system

Natural-sounding speech synthesis using variable-length units

Information-theoretic criteria for unit selection synthesis

Concatenative speech synthesis using a finite-state transducer
