2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07 2007
DOI: 10.1109/icassp.2007.367298

Statistical Parametric Speech Synthesis

Abstract: This review gives a general overview of techniques used in statistical parametric speech synthesis. One instance of these techniques, called hidden Markov model (HMM)-based speech synthesis, has recently been demonstrated to be very effective in synthesizing acceptable speech. This review also contrasts these techniques with the more conventional technique of unit-selection synthesis that has dominated speech synthesis over the last decade. The advantages and drawbacks of statistical parametric synthesis are h…

Cited by 209 publications (101 citation statements)
References 92 publications
“…The HMM-based speech synthesis system (HTS) [1] models spectrum, F0 and duration simultaneously in the unified framework of HSMM. In the training stage, the output vector of the HSMM consists of a spectrum part and an F0 part.…”
Section: Statistical Speech Synthesis (mentioning)
confidence: 99%
“…Recent advances in the field of statistical speech synthesis [1], have considerably reduced the gap between basic techniques used in automatic speech recognition (ASR) and text to speech (TTS). Feature types, feature dimensionality, duration and pitch modeling are a few of the key differences between the recognition and synthesis models [2].…”
Section: Introduction (mentioning)
confidence: 99%
“…Statistical parametric speech synthesis (SPSS) has dominated speech synthesis research area over the last decade [1,2]. It is mainly due to SPSS advantages over traditional concatenative speech synthesis approaches; these advantages include the flexibility to change voice characteristics [3][4][5], multilingual support [6][7][8], coverage of acoustic space [1], small footprint [1], and robustness [4,9].…”
Section: Introduction (mentioning)
confidence: 99%
“…Every SPSS system consists of two distinct phases, namely training and synthesis [1,2]. In the training phase, first acoustic and contextual factors are extracted for the whole training database using a vocoder [12,29,30] and a natural language pre-processor.…”
Section: Introduction (mentioning)
confidence: 99%
“…By using parameter generation algorithm [2], spectral and excitation parameters are generated from the sentence HMM. Finally, by using a synthesis filter, speech is synthesized from the generated spectral and excitation parameters [7], [16] and [17]. Spectral and excitation parameters are needed for any synthesis filter to generate speech waveforms so both must be modeled by HMMs.…”
Section: Introduction (mentioning)
confidence: 99%