[Proceedings] ICASSP 91: 1991 International Conference on Acoustics, Speech, and Signal Processing
DOI: 10.1109/icassp.1991.150395
An improved MMIE training algorithm for speaker-independent, small vocabulary, continuous speech recognition

Abstract: [Centre de recherche informatique de Montréal (CRIM), Montréal; Communications Systems Group] In several speech recognition tasks, Maximum Mutual Information Estimation (MMIE) of Hidden Markov Model (HMM) parameters can substantially improve recognition results [1,2]. However, it is usually implemented using gradient descent, which can have very slow convergence. Recently, Gopalakrishnan et al. [3] introduced a reestimation formula… More recently, a different formula for discrete…

Cited by 49 publications (27 citation statements)
References 9 publications (4 reference statements)
“…In the last decade, much effort has been put in finding more efficient and reliable algorithms [3], [4], [8]-[11]. The de-facto standard to optimize discriminative HMMs in speech recognition is Extended Baum-Welch (EBW) [12], [13], or more precisely empirical variants thereof [7], [10], [14].…”
Section: Introduction
confidence: 99%
“…Hence, although auxiliary functions may be computed for each of these, the difference of two lower bounds is not itself a lower bound and so standard EM cannot be used. To handle this problem, the extended Baum-Welch (EBW) criterion was proposed [64,129]. In this case, standard EM-like auxiliary functions are defined for the numerator and denominator but stability during re-estimation is achieved by adding scaled current model parameters to the numerator statistics.…”
Section: Parameter Estimation
confidence: 99%
“…Most applications of discriminative training methods for speech recognition use either the maximum mutual information (MMI) (Bahl et al., 1986; Brown, 1987; Cardin et al., 1993; Chow, 1990; Kapadia et al., 1993; Normandin, 1996; Normandin et al., 1994a,b; Normandin and Morgera, 1991; Reichl and Ruske, 1995; Valtchev et al., 1996, 1997) or the minimum classification error (MCE) (Chou et al., 1992, 1993, 1994; Paliwal et al., 1995; Reichl and Ruske, 1995) criterion. In MCE training, an approximation to the error rate on the training data is optimized, whereas MMI training optimizes the a posteriori probability of the training utterances and hence the class separability.…”
Section: Introduction
confidence: 99%
“…EB is an extension to the standard Baum-Welch algorithm designed for optimization of the MMI criterion. EB was first developed for discriminative training of discrete probabilities (Cardin et al., 1993; Gopalakrishnan et al., 1991; Normandin et al., 1994a; Normandin and Morgera, 1991), but was later extended to continuous densities (Normandin, 1991, 1996). Optimization of the MCE criterion is usually performed in combination with GD.…”
Section: Introduction
confidence: 99%