1994
DOI: 10.1109/89.260364

Combining TDNN and HMM in a hybrid system for improved continuous-speech recognition

Abstract: The paper presents a hybrid continuous-speech recognition system that leads to improved results on the speaker dependent DARPA Resource Management task. This hybrid system, called the combined system, is based on a combination of normalized neural network output scores with hidden Markov model (HMM) emission probabilities. The neural network is trained under mean square error and the HMM is trained under maximum likelihood estimation. In theory, whatever criterion may be used, the same word error rate should b…
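The abstract describes combining normalized neural-network output scores with HMM emission probabilities. As a rough illustration of that idea only (not the paper's exact formulation), the sketch below interpolates the two score streams frame by frame; the interpolation weight `lam` and the per-frame normalization are assumptions made for this example.

```python
import numpy as np

def combined_emission_scores(nn_outputs, hmm_likelihoods, lam=0.5):
    """Sketch: combine NN scores and HMM emission probabilities per frame.

    nn_outputs:      array of shape (T, S) -- raw network outputs per state
    hmm_likelihoods: array of shape (T, S) -- HMM emission probabilities
    lam:             interpolation weight in [0, 1] (assumed, not from the paper)
    """
    # Normalize network outputs so each frame sums to one (posterior-like scores).
    nn_scores = nn_outputs / np.clip(nn_outputs.sum(axis=1, keepdims=True), 1e-12, None)
    # Normalize HMM likelihoods per frame so both terms are on a comparable scale.
    hmm_scores = hmm_likelihoods / np.clip(hmm_likelihoods.sum(axis=1, keepdims=True), 1e-12, None)
    # Linear combination; the result would replace the emission term during decoding.
    return lam * nn_scores + (1.0 - lam) * hmm_scores

if __name__ == "__main__":
    # Toy example: 3 frames, 4 HMM states.
    rng = np.random.default_rng(0)
    nn = rng.random((3, 4))
    hmm = rng.random((3, 4))
    print(combined_emission_scores(nn, hmm, lam=0.6))
```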

Cited by 24 publications (12 citation statements)
References 16 publications
“…These investigations, mainly based on model averaging, showed some success when combining context-independent hybrid systems based on multi-layer perceptrons (MLPs) and recurrent neural networks (RNNs). Dugast et al [16] combined posterior probability estimates obtained from a time-delay neural network with the likelihoods generated by an HMM system with state emissions modelled by a mixture of Laplacians. Similar approaches combining scaled likelihoods produced by a two-layer MLP and HMM-GMM likelihoods were also investigated [17].…”
Section: Relation To Prior Work
confidence: 99%
“…The proposed technique allowed for a significant gain in performance for some classes of confusable letters (e.g., 'B', 'D', and 'V'), from 491 spellings of names collected over the telephone channel. The approach proposed by [31] is not truly a hybrid, since it uses a standard HMM (trained with the Viterbi algorithm under the ML criterion), which is used in parallel with connectionist models by combining the estimates of the emission probabilities (likelihoods) provided by the HMM with the normalized scores obtained with the ANNs. The linear combination scheme is the following:…”
Section: Other Approaches
confidence: 99%
“…Parallel ANN/HMM state-probability estimates [31]: HMM estimates of the emission probabilities are linearly combined with scores obtained with a hierarchical mixture of TDNNs.…”
Section: Model
confidence: 99%
“…A part of the prosodic information is obviously linguistic, but the rest of it conveys non-linguistic information. In our flowchart this branch is fuzzier than the other: levels and units are less clearly defined.…”
Section: Hierarchical Organization Of Speech Perception
confidence: 99%