Speech coders with bitrates as low as 2.4 kbits/s are now being developed for speech transmission in the telecommunications industry. For speech coders to work at such reduced bitrates, some speech information has to be removed, so the performance of speech recognition systems can be expected to deteriorate when coded speech is applied as input. In this paper, the results of a study examining the effects of speech coders on speech recognition are presented. Six different speech coders, with bitrates ranging from 4.8 kbits/s to 40 kbits/s, are used with two different speech recognition systems: 1) isolated word recognition and 2) phoneme recognition from continuous speech. The effects on speech recognition performance of tandeming each of the speech coders are also presented.
Speech recognition systems work reasonably well in laboratory conditions, but their performance deteriorates drastically when they are deployed in practical situations where the speech is corrupted by additive noise. One way to improve the performance of a speech recognition system in the presence of noise is to enhance the speech prior to recognition. Two singular value decomposition (SVD) based techniques have recently been proposed for speech enhancement [5] [6]. In these techniques, SVD is applied to an over-determined, over-extended data matrix formed from the noisy speech signal, and a noise-reduced, low-rank approximation is obtained by retaining a specific number of singular values. This technique was applied here as a preprocessor for recognising speech in the presence of noise. It was found to improve recognition performance significantly for signal-to-noise ratios below 15 dB.
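The low-rank step described above can be illustrated with a minimal sketch. It is not the method of [5] or [6]: the Hankel layout of the data matrix, the anti-diagonal averaging used to return to a one-dimensional signal, and the names `svd_enhance`, `rows` and `rank` are all assumptions made for illustration. The abstract states only that a data matrix is formed from the noisy speech and that a fixed number of singular values is retained.

```python
import numpy as np


def svd_enhance(frame, rows, rank):
    """Return a low-rank-filtered copy of one signal frame.

    Assumed construction (not taken from the cited papers):
    a Hankel data matrix H[i, j] = frame[i + j] with `rows` rows.
    """
    frame = np.asarray(frame, dtype=float)
    n = frame.size
    cols = n - rows + 1
    if cols < 1 or not (1 <= rank <= min(rows, cols)):
        raise ValueError("rows and rank are inconsistent with the frame length")

    # Index grid: entry (i, j) of the data matrix holds sample i + j.
    idx = np.arange(rows)[:, None] + np.arange(cols)[None, :]
    data = frame[idx]

    # Keep only the `rank` largest singular values.
    u, s, vt = np.linalg.svd(data, full_matrices=False)
    low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank]

    # The truncated matrix is generally no longer Hankel, so average
    # along each anti-diagonal to recover one value per sample.
    sums = np.bincount(idx.ravel(), weights=low_rank.ravel(), minlength=n)
    counts = np.bincount(idx.ravel(), minlength=n)
    return sums / counts


# Hypothetical usage: a noisy sinusoid stands in for a speech frame.
rng = np.random.default_rng(0)
t = np.arange(200)
clean = np.sin(0.2 * t)
noisy = clean + 0.3 * rng.standard_normal(t.size)
enhanced = svd_enhance(noisy, rows=150, rank=2)  # rows > cols: over-determined
print("MSE before:", np.mean((noisy - clean) ** 2))
print("MSE after: ", np.mean((enhanced - clean) ** 2))
```

A single real sinusoid gives a Hankel matrix of rank 2, which is why `rank=2` is used in this toy case. For real speech the number of retained singular values is a tuning choice that the abstract does not specify.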