Network-based connected digit recognition

Bush, Marcia A.; Kopec, Gary E.

doi:10.1109/tassp.1987.1165057

Cited by 35 publications

(10 citation statements)

References 30 publications


“…Nevertheless, they are in reasonable agreement to the formant ranges reported in [1, p. 132], [10, p. 51], [5], and [22]. Comparing the formant frequencies in Fig.…”

Section: A Examples Of Formant Estimates (supporting)

confidence: 90%

“…Using (1), the prediction error can be rewritten as (4) with the autocorrelation coefficients of segment for , 1, 2 (5) By minimizing the prediction error as given by (4) with respect to and , we obtain the following optimum prediction coefficients [19, p. 568…”

Section: A Second-order Resonator (mentioning)

confidence: 99%

“…A systematic evaluation of the method on the complete adult corpus of the TI digit string data base [18] is carried out. Formants have also been estimated on the same database in [5] and [17]. We use the estimated formant contours to perform systematic recognition experiments on the TI digit string data base.…”

Section: Introduction (mentioning)

confidence: 99%


Formant estimation for speech recognition

Welling; Ney (1998). IEEE Trans. Speech Audio Process.


Abstract: This paper presents a new method for estimating formant frequencies. The formant model is based on a digital resonator. Each resonator represents a segment of the short-time power spectrum. The complete spectrum is modeled by a set of digital resonators connected in parallel. An algorithm based on dynamic programming produces both the model parameters and the segment boundaries that optimally match the spectrum. We used this method in experimental tests that were carried out on the TI digit string data base. The main results of the experimental tests are: 1) the presented approach produces reliable estimates of formant frequencies across a wide range of sounds and speakers; and 2) the estimated formant frequencies were used in a number of variants for recognition. The best set-up resulted in a string error rate of 4.2% on the adult corpus of the TI digit string data base.


“…A stochastic segment model has previously been used for speaker-dependent phoneme and word recognition, demonstrating that a segment model outperformed a discrete hidden Markov model when both models were context-independent [Ostendorf and Roucos 1989]. Other segment-based models have also showed encouraging results in speaker-independent applications [Bush and Kopec 1987, Bocchieri and Doddington 1986, Makino and Kido 1986, Zue et al 1989].…”

Section: Jx :Y= Tax (mentioning)

confidence: 99%

Improvements in the stochastic segment model for Phoneme recognition

Digalakis; Ostendorf; Rohlíček (1989). Proceedings of the Workshop on Speech and Natural Language - HLT '89


The heart of a speech recognition system is the acoustic model of sub-word units (e.g., phonemes). In this work we discuss refinements of the stochastic segment model, an alternative to hidden Markov models for representation of the acoustic variability of phonemes. We concentrate on mechanisms for better modelling time correlation of features across an entire segment. Results are presented for speaker-independent phoneme classification in continuous speech based on the TIMIT database.


“…Using a network that uses acoustic-phonetic features, Bush and Kopec (1987) achieved an accuracy of 96%. Rabiner et al (1988) achieved an accuracy of 97.1% using a hidden Markov model.…”

Section: 7 (mentioning)

confidence: 99%

Speaker-Independent Digit Recognition Using a Neural Network with Time-Delayed Connections

Unnikrishnan; Hopfield; Tank (1992). Neural Computation


The capability of a small neural network to perform speaker-independent recognition of spoken digits in connected speech has been investigated. The network uses time delays to organize rapidly changing outputs of symbol detectors over the time scale of a word. The network is data driven and unclocked. To achieve useful accuracy in a speaker-independent setting, many new ideas and procedures were developed. These include improving the feature detectors, self-recognition of word ends, reduction in network size, and dividing speakers into natural classes. Quantitative experiments based on Texas Instruments (TI) digit data bases are described.
