2003
DOI: 10.1109/tsa.2003.818114

Bayesian learning of speech duration models

Abstract: This paper presents the Bayesian speech duration modeling and learning for hidden Markov model (HMM) based speech recognition. We focus on the sequential learning of HMM state duration using quasi-Bayes (QB) estimate. The adapted duration models are robust to nonstationary speaking rates and noise conditions. In this study, the Gaussian, Poisson, and gamma distributions are investigated to characterize the duration models. The maximum a posteriori (MAP) estimate of gamma duration model is developed. To exploit…
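
As a rough illustration of the three duration distributions named in the abstract, the sketch below fits Gaussian, Poisson, and gamma models to a list of state durations and compares their log-likelihoods. It is not the paper's quasi-Bayes or MAP procedure: it uses plain batch estimates (maximum likelihood for the Gaussian and Poisson, moment matching for the gamma), and the function names and the sample durations are hypothetical.

```python
import math
from statistics import fmean, pvariance


def fit_duration_models(durations):
    """Fit three candidate duration models to positive durations (in frames).

    Assumption: simple batch estimators, not the sequential QB/MAP
    estimators described in the paper.
    """
    mean = fmean(durations)
    var = pvariance(durations, mu=mean)
    return {
        "gaussian": {"mean": mean, "var": var},       # ML estimates
        "poisson": {"rate": mean},                    # ML estimate
        # Moment matching: shape / rate = mean, shape / rate**2 = var.
        "gamma": {"shape": mean * mean / var, "rate": mean / var},
    }


def log_likelihood(name, params, durations):
    """Total log-likelihood of the durations under one fitted model."""
    total = 0.0
    for d in durations:
        if name == "gaussian":
            var = params["var"]
            total += (-0.5 * math.log(2 * math.pi * var)
                      - (d - params["mean"]) ** 2 / (2 * var))
        elif name == "poisson":
            rate = params["rate"]
            total += d * math.log(rate) - rate - math.lgamma(d + 1)
        elif name == "gamma":
            a, b = params["shape"], params["rate"]
            total += (a * math.log(b) - math.lgamma(a)
                      + (a - 1) * math.log(d) - b * d)
        else:
            raise ValueError(f"unknown model: {name}")
    return total


# Hypothetical state durations, in frames.
durations = [3, 4, 4, 5, 6, 7, 9]
models = fit_duration_models(durations)
scores = {name: log_likelihood(name, p, durations) for name, p in models.items()}
```

The `scores` dictionary can be used to see which family fits a given set of durations best; a sequential scheme would instead update the parameters as new utterances arrive.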

Cited by 24 publications (4 citation statements)
References: 25 publications
“…In the MCDC8, the duration of ordinary syllables ranges from 15 to 1110 ms (mean, 173 ms) and is equal to 5.78 syllables per second, which is faster than the articulation rate of news reporters in a Chinese Broadcast News corpus (Chien and Huang, 2003). We ran the variant selection algorithm on all disyllabic words in the MCDC8.…”
Section: A Disyllabic Words In Conversational Speech
Type: mentioning
confidence: 99%
“…The phone duration modeling approaches are divided in two major categories: The rule-based (Klatt, 1979) and the data-driven methods (Mobius and Santen, 1996; Santen, 1992; Chen et al, 1998; Chien and Huang, 2003; Lazaridis et al, 2007). In the rule-based methods manually produced rules, extracted from experimental studies on large sets of utterances or based on previous knowledge, are utilized for determining the duration of segments.…”
Section: Introduction
Type: mentioning
confidence: 99%
“…Over the last years various statistical methods have been applied in the phone duration modeling task such as, Linear Regression (LR) (Takeda et al, 1989), decisions tree-based models (Mobius and Santen, 1996), Sums-Of-Products (SOP) (Santen, 1992). Artificial Neural Networks (ANN) techniques (Chen et al, 1998), Bayesian models (Chien and Huang, 2003) and instance-based algorithms (Lazaridis et al, 2007) have also been introduced on the phone duration modeling task. Consequently the data-driven approaches offer us the ability to overcome the time consuming labor of the manual extraction of the rules which are needed in the rule-based approaches.…”
Section: Introduction
Type: mentioning
confidence: 99%
“…Alternatively, besides following the formal transition probability estimation for HMM, the lack of distinct duration modelling for non-stationary SRs may be addressed by SR dependent HMM [Anastasakos et al., ] [Zheng et al, 2003] or the estimation of transition dependent probability distributions modelling discrete duration length [Chien & Huang, 2003].…”
Section: Initial Acoustic Model Estimation
Type: mentioning
confidence: 99%