Analysis of syllable duration models for Mandarin speech

Lai, Wen-Hsing; Chen, Sin-Horng

doi:10.1109/icassp.2002.5743763

Cited by 6 publications

(3 citation statements)

References 7 publications

“…When performing M-step, the new estimate is obtained by solving (16). However, we could not derive the closed-form solution to the new estimate. The Newton's algorithm [1] is applied to iteratively reach the optimal solution (17), where (18). In [36], MAP adaptation of Gaussian duration parameters was proposed. Gamma priors were adopted to derive MAP estimates of Gaussian mean.…”

Section: B. MAP Estimation for Gamma Duration Parameters (mentioning)

confidence: 99%

“…N-best hypotheses of fast speech could be properly selected to improve large vocabulary continuous speech recognition performance [9]. Except the effects on speech recognition, speaking rate served as an important prosody feature in text-to-speech system [18]. Synthesized speech with user-adaptive speaking rate provided good naturalness for human listeners [34].…”

Section: Introduction (mentioning)

confidence: 99%

Bayesian learning of speech duration models

Chien

Huang

2003

IEEE Trans. Speech Audio Process.

This paper presents the Bayesian speech duration modeling and learning for hidden Markov model (HMM) based speech recognition. We focus on the sequential learning of HMM state duration using quasi-Bayes (QB) estimate. The adapted duration models are robust to nonstationary speaking rates and noise conditions. In this study, the Gaussian, Poisson, and gamma distributions are investigated to characterize the duration models. The maximum a posteriori (MAP) estimate of gamma duration model is developed. To exploit the sequential learning, we adopt the Poisson duration model incorporated with gamma prior density, which belongs to the conjugate prior family. When the adaptation data are sequentially observed, the gamma posterior density is produced with twofold advantages. One is to determine the optimal QB duration parameter, which can be merged in HMMs for speech recognition. The other one is to build the updating mechanism of gamma prior statistics for sequential learning. EM algorithm is applied to fulfill QB parameter estimation. The adaptation of overall HMM parameters can be performed simultaneously. In the experiments, the proposed adaptive duration model improves the speech recognition performance of Mandarin broadcast news and noisy connected digits. The batch and sequential learning are respectively investigated for MAP and QB duration models.
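The conjugate relationship mentioned in the abstract can be illustrated with a minimal sketch. It assumes fully observed state durations and a gamma prior in shape/rate form; the paper's QB estimate runs inside EM over HMM state alignments, which this sketch does not reproduce. The names `GammaPrior`, `update`, and `posterior_mode` and the starting statistics are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class GammaPrior:
    """Gamma(shape, rate) density over the Poisson mean duration (in frames)."""
    shape: float
    rate: float


def update(prior: GammaPrior, durations: Iterable[int]) -> GammaPrior:
    """Conjugate update: add the duration sum to the shape and the count to the rate."""
    durations = list(durations)
    return GammaPrior(prior.shape + sum(durations), prior.rate + len(durations))


def posterior_mode(p: GammaPrior) -> float:
    """Mode of the gamma density, (shape - 1) / rate, clipped at zero."""
    return max(p.shape - 1.0, 0.0) / p.rate


prior = GammaPrior(shape=2.0, rate=1.0)   # hypothetical starting statistics
batch = update(prior, [3, 5, 4])          # all adaptation data at once

sequential = prior
for d in [3, 5, 4]:                       # one observation at a time
    sequential = update(sequential, [d])
```

Because the posterior stays in the gamma family, feeding the observations one at a time ends at the same statistics as a single batch update, which is the property that makes the prior statistics reusable for sequential learning.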

“…In this paper we try to partially solve the problem via extending the conventional state duration method to further consider the speaking rate of the testing utterance and adding a new syllable duration model in an HMM-based Mandarin base-syllable recognizer for improving its performance. The proposed statistical duration modeling method has been tried on Min-Nan and Mandarin Chinese text-to-speech system and got some significant improvements on the duration prediction [4,5].…”

Section: Introduction (mentioning)

confidence: 99%

Duration modeling for Mandarin speech recognition using prosodic information

Wang

Lee

2004

Speech Prosody 2004

In this paper, a new duration modeling method for HMM-based Mandarin base-syllable recognition is proposed. It extends the conventional state duration method to further consider the speaking rate of utterance and add a syllable duration model to help the recognition search finding the best-recognized base-syllable string. Experimental results showed that the proposed method was effective on improving the recognition accuracy.

An improved HMM speech recognition model

Yuan

2008

2008 International Conference on Audio, Language and Image Processing
