Maximum Conditional Mutual Information Weighted Scoring for Speech Recognition

Omar, Mohamed Kamal; Ramaswamy, Ganesh N.

doi:10.1109/icassp.2006.1660011

Cited by 4 publications

(4 citation statements)

References 8 publications


“…This could bring the benefit of using a class specific covariance model while avoiding the overfitting risk associated with it. A similar work using class specific covariance model is [3]. In [4] it has been shown that the posterior form of Gaussian HMM can be represented as an HCRF model. For the case of a pooled covariance HMM this simplifies to a CRF or log-linear model.…”

Section: Introduction (mentioning)

confidence: 99%

Log-linear optimization of second-order polynomial features with subsequent dimension reduction for speech recognition

Tahir, Schlüter, Ney

2011

Interspeech 2011



“…The parameters of the feature transformation matrix are calculated by LDA, but alternatively they could also be trained discriminatively. One such example is [4] where an iterative optimization is used to directly train a reduced dimension feature transformation matrix. Another one is [2]; here the transformation matrix is trained assuming unequal class covariances for Gaussian densities.…”

Section: Introduction (mentioning)

confidence: 99%

Generalized likelihood ratio discriminant analysis

Tahir, Heigold, Plahl, et al.

2009

2009 IEEE Workshop on Automatic Speech Recognition & Understanding


Linear Discriminant Analysis (LDA) has been established as an important means for dimension reduction and decorrelation in speech recognition. The major points of criticism of LDA are that it uses an ad hoc and non-discriminative training criterion, and that the estimation is performed in a separate preprocessing step. This paper presents a new discriminative training method for the estimation of (projecting) linear feature transforms. More precisely, the problem is formulated in the loglinear framework, resulting in a convex optimization problem. Experimental results are provided for a digit string recognition task to compare the performance and robustness of the proposed approach (in combination with ML or MMI optimized acoustic models) with conventional LDA. Also, first experiments for a large vocabulary task are presented.


“…Early influential work along these lines involved data-driven methods for robust feature extraction [44] and filterbank design [45]-[47]. More recent methods include: 1) heteroscedastic linear discriminant analysis (HLDA) [48] and neighborhood component analysis [49] to learn informative low dimensional projections of high dimensional acoustic feature vectors; 2) stochastic gradient and second-order methods to tune parameters related to frequency warping and mel-scale filterbanks [50], [51]; 3) maximum-likelihood methods for speaker and environment adaptation [52], [53] that perform linear transformations of the acoustic feature space at test time; and 4) extensions of popular frameworks for discriminative training, such as minimum phone error [54] and maximum mutual information [55], to learn accuracy-improving transformations and projections of the acoustic feature space.…”

Section: Acoustic Feature Adaptation (mentioning)

confidence: 99%

Online Learning and Acoustic Feature Adaptation in Large-Margin Hidden Markov Models

Cheng, Sha, Saul

2010

IEEE J. Sel. Top. Signal Process.


Abstract: We explore the use of sequential, mistake-driven updates for online learning and acoustic feature adaptation in large-margin hidden Markov models (HMMs). The updates are applied to the parameters of acoustic models after the decoding of individual training utterances. For large-margin training, the updates attempt to separate the log-likelihoods of correct and incorrect transcriptions by an amount proportional to their Hamming distance. For acoustic feature adaptation, the updates attempt to improve recognition by linearly transforming the features computed by the front end. We evaluate acoustic models trained in this way on the TIMIT speech database. We find that online updates for large-margin training not only converge faster than analogous batch optimizations, but also yield lower phone error rates than approaches that do not attempt to enforce a large margin. Finally, experimenting with different schemes for initialization and parameter-tying, we find that acoustic feature adaptation leads to further improvements beyond the already significant gains achieved by large-margin training.

Index Terms: Acoustic feature adaptation, automatic speech recognition (ASR), discriminative training, hidden Markov models (HMMs), large-margin classification, online learning.
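The large-margin criterion described in this abstract, where the correct transcription's log-likelihood should exceed an incorrect one's by an amount proportional to their Hamming distance, can be illustrated with a small Python sketch. This is not the authors' implementation: the function names, the scaling factor `rho`, and the use of per-frame label sequences are assumptions made for illustration only.

```python
from typing import Sequence


def hamming_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Count positions where two equal-length label sequences differ."""
    if len(ref) != len(hyp):
        raise ValueError("sequences must have equal length")
    return sum(r != h for r, h in zip(ref, hyp))


def margin_violation(score_ref: float, score_hyp: float,
                     ref: Sequence[str], hyp: Sequence[str],
                     rho: float = 1.0) -> float:
    """Hinge-style margin violation (hypothetical helper).

    Returns a positive value when the reference score fails to exceed the
    hypothesis score by at least rho * Hamming distance, and 0.0 otherwise.
    """
    required = rho * hamming_distance(ref, hyp)
    return max(0.0, required - (score_ref - score_hyp))
```

In a mistake-driven scheme of the kind the abstract mentions, a parameter update would presumably be triggered only when this violation is positive for a decoded training utterance; the form of the update itself is not given here.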

