2009 IEEE International Conference on Acoustics, Speech and Signal Processing
DOI: 10.1109/icassp.2009.4960448

Probabilistic modelling of F0 in unvoiced regions in HMM based speech synthesis

Abstract: HMM based synthesis has attracted great interest due to its compact and flexible modelling of spectral and prosodic parameters. In this approach, short term spectra, fundamental frequency (F0) and duration are simultaneously modelled by multi-stream HMMs. However, since F0 values in unvoiced regions are normally considered as undefined, it is difficult to use standard HMMs for F0 modelling. The currently preferred solution to this is to use a multi-space distribution HMM (MSDHMM) in which discrete distribution…

Cited by 18 publications (22 citation statements)
References 4 publications
“…By generating real F0 values for unvoiced regions and assuming hidden voicing labels, the Continuous F0 model with Globally Tied Distribution [6], CF-GTD in figure 1(b), is obtained. The state output distribution can be expressed as…”
Section: Comparison Of F0 Modelling Approaches For HMM Based Speech S…
Citation type: mentioning
confidence: 99%
“…Experiments have shown that it can greatly reduce the F0 trajectory modelling error, and consequently improve the naturalness of the synthesised speech [6,7]. However, due to hidden voicing labels, voicing classification only relies on the statistical difference between the globally tied unvoiced component and the state specific voiced component.…”
Section: Comparison Of F0 Modelling Approaches For HMM Based Speech S…
Citation type: mentioning
confidence: 99%