Yehlin Lee scite author profile

Yehlin Lee

2 Publications

55 Citation Statements Received

12 Citation Statements Given

How they've been cited

113

Affiliations

Institute of Linguistics, Academia Sinica

Publications

Fluent speech prosody: Framework and modeling

Tseng, Lee, et al. 2005

Speech Communication

112

The prosody of fluent connected speech is much more complicated than concatenating individual sentence intonations into strings. We analyzed speech corpora of read Mandarin Chinese discourses from a top-down perspective on perceived units and boundaries, and consistently identified speech paragraphs of multiple phrases that reflected discourse rather than sentence effects in fluent speech. Subsequent cross-speaker and cross-speaking-rate acoustic analyses of identified speech paragraphs revealed systematic cross-phrase prosodic patterns in every acoustic parameter, namely F0 contours, duration adjustment, intensity patterns, and, in addition, boundary breaks. We therefore argue for a higher prosodic node that governs, constrains, and groups phrases to derive speech paragraphs. A hierarchical multi-phrase framework is constructed to account for the governing effect, with complementary production and perceptual evidence. We show how cross-phrase F0 and syllable duration pattern templates are derived to account for the tune and rhythm characteristic of fluent speech prosody, and argue for a prosody framework that specifies phrasal intonations as subjacent sister constituents subject to higher terms. Output fluent speech prosody is thus the cumulative result of contributions from every prosodic layer. To test our framework, we further construct a modular prosody model of multiple-phrase grouping with four corresponding acoustic modules and begin testing the model with speech synthesis. To conclude, we argue that any prosody framework of fluent speech should include prosodic contributions above individual sentences in production, with consideration of their perceptual effects on on-line processing; and development of unlimited TTS could benefit most appreciably by capturing and including cross-phrase relationships in prosody modeling. © 2005 Published by Elsevier B.V.

A Mandarin TTS system with an integrated prosodic model

Pin, Lee, Chen, et al.

Phrase grouping is essential to characterizing the prosody of fluent Mandarin speech. Evidence of prosodic phrase grouping has been found both in adjustments of F0 contours and in temporal allocations within and across phrases. In this paper, we discuss the development of a Mandarin TTS system that integrates prosody processing modules such as duration modeling, F0 modeling, and break prediction. The database consists of 1292 × 3 syllable tokens excised from specially designed three-phrase carrier sentences. Since temporal allocations and rhythmic structure in speech flow are carefully dealt with, the TTS system is capable of converting long paragraph text input into natural synthesized speech output.
