Hans Kruschke scite author profile

The generation of prosodic parameters such as F0 contour, duration and intensity remains an important issue for natural-sounding text-to-speech (TTS), although recently developed TTS systems have achieved considerable progress. Several appropriate but language-specific rule-based, statistical or data-driven prosody models have been successfully realized in many systems. These language- and parameter-dependent models lead to a more complex and inefficient TTS system design. In earlier works the authors proposed a hybrid data-driven and rule-based model, which can adapt to different voices or speaking styles by learning and predicting prosodic parameters. The current paper discusses the multilingual generalization of the model and the design of appropriate prosodic databases. As examples, two different languages, German and Mandarin Chinese, are examined. Prediction results and perceptual evaluation with respect to F0 contours and duration values are presented. Since the perceptual results for both languages are comparable and quite satisfying, the model is suitable for multilingual prosody control. Resynthesis stimuli obtained from modified prosodic parameters partly achieve near-to-natural mean opinion scores (MOS) above 4.0. The introduced hybrid data-driven and rule-based model is comparatively simple and enables multilingual prosody control in TTS.


Estimation of the parameters of the quantitative intonation model with continuous wavelet analysis

Kruschke, Lenz (2003)



Hans Kruschke

A multilingual TTS system with less than 1 Mbyte footprint for embedded applications

Parameter extraction of a quantitative intonation model with wavelet analysis and evolutionary optimization

Learning the parameters of quantitative prosody models

Towards a multilingual prosody model for text-to-speech

Estimation of the parameters of the quantitative intonation model with continuous wavelet analysis
