Evaluation of various parameter sets in spoken digits recognition

Ichikawa, Akira; Nakano, Y.; Nakata, Kiyotomo

doi:10.1109/tau.1973.1162480

Cited by 18 publications

References 3 publications


Use of generalized cepstral distance measure in isolated word recognition

Kobayashi

Kondo

Imai

1989

Electron Comm Jpn Pt III


This paper proposes a generalized cepstral distance measure based on minimum-phase spectral models and shows the performance of the distance measure for isolated word recognition. The generalized cepstral distance measure approximates the L2 norm between two spectra on a fractional power magnitude scale. The measure reduces to the cepstral distance measure as a special case. The distance can be calculated efficiently from cepstral coefficients or linear predictor coefficients. Isolated word recognition experiments using a vocabulary of 20 highly confusable Japanese city names indicate that the generalized cepstral distance measure gives higher recognition accuracy than conventional distance measures. Three analysis parameter sets (FFT cepstral coefficients, linear predictor coefficients, and improved cepstral coefficients) are also compared in terms of their effects on recognition performance.
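The abstract does not give the formula for the generalized measure, so the following is only a minimal sketch of the conventional cepstral distance that it names as the special case. It assumes real-cepstrum coefficients c[0..P] such that the log-magnitude spectrum is c[0] + 2 Σ c[n] cos(nω); under that assumption the mean-square difference between two log spectra equals a weighted sum of squared coefficient differences. The function names and the 512-point frequency grid are illustrative choices, not taken from the paper.

```python
import numpy as np

def log_spectrum(c, n_freq=512):
    """Log-magnitude spectrum on a uniform grid over [0, 2*pi),
    assuming log|S(w)| = c[0] + 2 * sum_{n>=1} c[n] * cos(n*w)."""
    c = np.asarray(c, dtype=float)
    omega = 2.0 * np.pi * np.arange(n_freq) / n_freq
    n = np.arange(1, len(c))
    return c[0] + 2.0 * np.cos(np.outer(omega, n)) @ c[1:]

def cepstral_distance(c1, c2):
    """Squared cepstral distance: equals the mean-square log-spectral
    difference for the cepstral model assumed above."""
    d = np.asarray(c1, dtype=float) - np.asarray(c2, dtype=float)
    return d[0] ** 2 + 2.0 * np.sum(d[1:] ** 2)

def mean_square_log_spectral_difference(c1, c2, n_freq=512):
    """Direct numerical evaluation in the frequency domain, for comparison."""
    diff = log_spectrum(c1, n_freq) - log_spectrum(c2, n_freq)
    return np.mean(diff ** 2)
```

Computing the distance from the coefficients avoids evaluating either spectrum, which is consistent with the abstract's remark that the calculation can be done efficiently from cepstral coefficients.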



Evaluation of the smoothed group delay spectrum distance measure for speaker‐dependent speech recognition

Umezaki

Singer

Itakura

1991

Electron Comm Jpn Pt III


This paper first evaluates the smoothed group delay spectrum (SGDS) distance measure through speaker-dependent isolated word recognition experiments. The experiments cover three cases that reflect speech recognition in real environments: 1) the channels have different characteristics; 2) white noise is added to the input speech; and 3) telephone speech is used as the input. In all three cases, the recognition rate improves drastically compared with the traditional LPC cepstrum distance measure. An improvement in recognition rate of 16 percent was obtained under noise at a segmental SNR of 20 dB. The distance measure is then evaluated for the case where the FFT cepstrum is converted into the group delay spectrum. The proposed method gives a better recognition rate than the conventional FFT cepstrum distance measure, but the result is worse than that of the SGDS measure by approximately 3 percent, since the higher‐order FFT cepstrum coefficients have a larger variance along the time axis. Finally, the SGDS distance measure is evaluated in an isolated word recognition system that uses monosyllables as the registered speech. The vowel recognition rate improves, which in turn improves the syllable and word recognition rates by 2 percent or more on a relative scale.
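As a rough illustration of converting a cepstrum into a group delay spectrum, the sketch below assumes a minimum-phase model whose cepstral coefficients c[1..P] give the group delay as Σ n c[n] cos(nω). Under that assumption, the mean-square difference between two group delay spectra is a cepstral distance weighted by n², which may help explain why the variance of higher-order coefficients matters. The smoothing used in the SGDS measure is not specified in the abstract and is not reproduced here; the function names and grid size are hypothetical.

```python
import numpy as np

def group_delay(c, n_freq=512):
    """Group delay spectrum on a uniform grid over [0, 2*pi), assuming a
    minimum-phase model: tau(w) = sum_{n>=1} n * c[n] * cos(n*w).
    c[0] (the gain term) does not affect the group delay."""
    c = np.asarray(c, dtype=float)
    omega = 2.0 * np.pi * np.arange(n_freq) / n_freq
    n = np.arange(1, len(c))
    return np.cos(np.outer(omega, n)) @ (n * c[1:])

def group_delay_distance(c1, c2):
    """Mean-square group delay difference expressed in the cepstral domain:
    each coefficient difference is weighted by its index n."""
    d = np.asarray(c1, dtype=float) - np.asarray(c2, dtype=float)
    n = np.arange(1, len(d))
    return 0.5 * np.sum((n * d[1:]) ** 2)

def mean_square_group_delay_difference(c1, c2, n_freq=512):
    """Direct numerical evaluation in the frequency domain, for comparison."""
    diff = group_delay(c1, n_freq) - group_delay(c2, n_freq)
    return np.mean(diff ** 2)
```

Because the weight grows with n, unsmoothed group delay distances emphasize high-order coefficients; a smoothed variant would presumably attenuate that emphasis.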


Speech input/output system employing a minicomputer

Fujisawa

Shirai

1974

Electrical Engineering Japan


