2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).
DOI: 10.1109/icassp.2003.1198879

Recent improvements to the IBM trainable speech synthesis system

citations
Cited by 33 publications
(24 citation statements)
references
References 6 publications
“…The interest of intonation models in TTS has lessened due to the use of synthesis techniques based on the speech-unit selection (Eide et al, 2003) or HMM synthesis (Tokuda et al, 2000). Nevertheless, the prediction of realistic target F0 contours is still useful for guiding the search of units in the corpus (Rodríguez and Campillo, 2006;Eide et al, 2003).…”
Section: Quality Of Synthetic Contours (mentioning)
confidence: 99%
“…Nevertheless, the prediction of realistic target F0 contours is still useful for guiding the search of units in the corpus (Rodríguez and Campillo, 2006;Eide et al, 2003). A list of dictionariesD The ordered list of dictionaries provides a way to build a graph of classes which conveys schematic visual information about the intonation patterns found in the corpus and their corresponding labels of prosodic features (see Appendix B for an explanation and section 3.5 for a more detailed discussion of the use of this graph in the experiments for Spanish language).…”
Section: Quality Of Synthetic Contours (mentioning)
confidence: 99%
“…Corpus-based concatenative approach to speech synthesis has been widely explored in the research community in recent years [1,2,3]. Intonation modeling, or generation of fundamental frequency (F0) contour plays a crucial role in synthesizing natural sounding speech from input text.…”
Section: Introduction (mentioning)
confidence: 99%
“…Target F0 contour is generated using the features extracted from input text, and it is used either to modify the pitch of selected synthesis units, or in the unit selection where the discrepancies between target F0 contour and the F0 values of the synthesis units to be selected are attempted to be made as small as possible in the overall cost minimization through a search in the space of all available synthesis units. There has been a number of efforts in the context of F0 contour generation for English speech synthesis in the past decade, such as dynamical system [4], linear regression-based approach [5], combination of parametric models with regression trees [6,7], and the combination of regression trees and kernel smoother [2].…”
Section: Introduction (mentioning)
confidence: 99%
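
The last citation statement describes unit selection in which the mismatch between the target F0 contour and the F0 of the candidate units is minimized as part of an overall cost, via a search over all available units. A minimal sketch of that idea follows, assuming one F0 value per unit, an absolute-difference target cost, an absolute-difference join cost, and a dynamic-programming (Viterbi-style) search. The function name `select_units`, the parameter `join_weight`, and both cost definitions are hypothetical choices made for illustration; none of them is taken from the cited papers or from the IBM system.

```python
from typing import List, Sequence


def select_units(target_f0: Sequence[float],
                 candidates: Sequence[Sequence[float]],
                 join_weight: float = 0.5) -> List[int]:
    """Pick one candidate per position so that the total cost is minimal.

    target_f0[t]     -- predicted F0 (Hz) for position t
    candidates[t][j] -- F0 (Hz) of the j-th candidate unit at position t
    join_weight      -- assumed weight of the F0 jump between adjacent units
    Returns the chosen candidate index for every position.
    """
    if len(target_f0) != len(candidates) or any(len(c) == 0 for c in candidates):
        raise ValueError("need one non-empty candidate list per target value")
    if len(target_f0) == 0:
        return []

    # best[j]: cheapest cumulative cost of any path ending in candidate j.
    best = [abs(f0 - target_f0[0]) for f0 in candidates[0]]
    back: List[List[int]] = []  # back[t-1][j]: best predecessor of candidate j at t

    for t in range(1, len(target_f0)):
        new_best: List[float] = []
        pointers: List[int] = []
        for f0 in candidates[t]:
            # Cost of arriving from each predecessor (assumed join cost).
            arrive = [best[i] + join_weight * abs(f0 - prev_f0)
                      for i, prev_f0 in enumerate(candidates[t - 1])]
            i_min = min(range(len(arrive)), key=arrive.__getitem__)
            # Add the assumed target cost: distance to the target contour.
            new_best.append(arrive[i_min] + abs(f0 - target_f0[t]))
            pointers.append(i_min)
        best = new_best
        back.append(pointers)

    # Trace the cheapest path back from the last position.
    j = min(range(len(best)), key=best.__getitem__)
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path
```

With `join_weight` set to zero the search simply picks, at each position, the candidate closest to the target contour; a larger weight trades target accuracy for smoother F0 transitions between adjacent units.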