2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100)
DOI: 10.1109/icassp.2000.862024

Tandem connectionist feature extraction for conventional HMM systems

Abstract: Hidden Markov model speech recognition systems typically use Gaussian mixture models to estimate the distributions of decorrelated acoustic feature vectors that correspond to individual subword units. By contrast, hybrid connectionist-HMM systems use discriminatively-trained neural networks to estimate the probability distribution among subword units given the acoustic observations. In this work we show a large improvement in word recognition performance by combining neural-net discriminative feature processin…

Cited by 529 publications (414 citation statements)
References: 10 publications
“…It is worth mentioning that KL-HMM was originally developed from the perspective of acoustic modeling, as an alternative to Tandem approach (Hermansky et al, 2000). However, as shown recently and briefly explained in this section, KL-HMM is a probabilistic modeling approach (Rasipuram and Magimai.-Doss, 2013a,b).…”
Section: Kullback-Leibler Divergence Based HMM (mentioning)
confidence: 99%
“…We investigate two systems, the first system uses standard cepstral features as feature observations (HMM/GMM system) and the second system uses Tandem features as feature observations (Tandem system) (Hermansky et al, 2000). As indicated in Table 1, Tandem system exploits both language-dependent and language-independent resources similar to probabilistic lexical model based systems and acoustic model adaptation based systems.…”
Section: Standard Language-dependent Acoustic Model and Lexical Model (mentioning)
confidence: 99%
“…Deep Boltzmann machines have been used as stacked autoencoders for feature extraction [20] or post-processing of local binary patterns from three orthogonal planes (LBP-TOP) [25]. These features are then classified using Support Vector Machines (SVMs) [20], where all utterance lengths have to be normalized, or using a tandem system [13], where the features are passed into a GMM-HMM recognizer [15,21,25]. Similarly, feature extraction has been performed by convolutional neural networks (CNNs) [21,16] and deep belief networks (DBNs) [15].…”
Section: Related Work (mentioning)
confidence: 99%
“…For PLP cepstral features, usually 9 frames of PLP coefficients and their first and second order derivatives are concatenated as the input for a trained MLP to estimate the posterior probabilities of context-independent phones [5]. The phonetic class is defined with respect to the center of 9 frames.…”
Section: Single Stream Posterior Estimation (mentioning)
confidence: 99%
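
The 9-frame context window described in the statement above can be illustrated with a small sketch. This is not code from the cited paper: the function name `splice_frames`, the 39-dimensional toy vectors, and the edge handling (repeating the first or last frame) are all assumptions made for the example. It only shows how each frame might be concatenated with its four neighbours on either side before being passed to an MLP.

```python
def splice_frames(features, context=4):
    """Stack each frame with `context` neighbours on either side.

    features: list of per-frame vectors (lists of floats), all the same length.
    Returns one spliced vector per frame, of length (2 * context + 1) * dim.
    Edge frames are padded by repeating the boundary frame (an assumption;
    the quoted text does not say how edges are handled).
    """
    if not features:
        return []
    n = len(features)
    spliced = []
    for t in range(n):
        window = []
        for offset in range(-context, context + 1):
            # Clamp the index so edge frames reuse the first or last frame.
            idx = min(max(t + offset, 0), n - 1)
            window.extend(features[idx])
        spliced.append(window)
    return spliced


# Toy input: 12 frames of 39-dimensional vectors. The dimension is hypothetical;
# each frame is filled with its own index so the window layout is easy to check.
frames = [[float(t)] * 39 for t in range(12)]
spliced = splice_frames(frames, context=4)
```

With `context=4` the window spans 9 frames, and the centre frame, which the quoted text says defines the phonetic class, occupies the middle block of each spliced vector.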
“…This hierarchical approach provides a new, principled, theoretical framework for combining different streams of features taking into account context and model knowledge. We show that this method gives significant performance improvement over baseline PLP-TANDEM [5] and TRAP-TANDEM [11] techniques and also entropy based combination method [12] on OGI digits [13] and a reduced vocabulary version (1000 words) of CTS [6] databases.…”
Section: Introduction (mentioning)
confidence: 99%