Automatic speech recognition systems trained on close-talking data perform poorly in distant-talking environments because the training and testing conditions are mismatched. Microphone array sound capture can reduce some of this mismatch by suppressing ambient noise and reverberation, yielding an approximation to the clean speech signal. On its own, however, this often does not improve recognition performance sufficiently. Combining array signal capture with Hidden Markov Model (HMM) adaptation of the clean-speech models can result in high recognition accuracy. This paper describes an experiment in which the output of an eight-element microphone array system using MFA processing is used for speech recognition with LT-MLLR adaptation. Recognition is done in two passes. In the first pass, an HMM trained on clean data is used to recognize the speech. Using the results of this pass, the HMM is adapted to the environment with the LT-MLLR algorithm. In the second pass, the adapted model is used to recognize the speech. It is shown that the use of MFA and LT-MLLR results in high-accuracy recognition. [Work supported by DARPA Contract DABT63-93-C-0037.] a) Currently at Raytheon Systems Company, Falls Church, VA.
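The two-pass procedure described above can be illustrated with a toy sketch. This is not the paper's method. It assumes a nearest-mean classifier in place of the HMM recognizer, and a single global affine transform of the model means, fitted by least squares to the first-pass labels, in place of LT-MLLR; it follows only the general idea of MLLR-style mean adaptation, and it omits the MFA array processing entirely. All function names are hypothetical.

```python
import numpy as np

def recognize(frames, means):
    """Label each frame (row) with the index of the nearest class mean."""
    dists = ((frames[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def estimate_affine_transform(frames, labels, means):
    """Fit W (shape D+1 by D) so that [mean, 1] @ W approximates the frames
    assigned to that mean. A simplified stand-in for LT-MLLR (assumption)."""
    ext = np.hstack([means[labels], np.ones((len(labels), 1))])
    W, *_ = np.linalg.lstsq(ext, frames, rcond=None)
    return W

def adapt_means(means, W):
    """Apply the fitted affine transform to every model mean."""
    ext = np.hstack([means, np.ones((len(means), 1))])
    return ext @ W

def two_pass_recognize(frames, clean_means):
    # Pass 1: recognize with the model trained on clean data.
    first = recognize(frames, clean_means)
    # Adapt the clean model to the environment using the pass-1 results.
    W = estimate_affine_transform(frames, first, clean_means)
    adapted = adapt_means(clean_means, W)
    # Pass 2: recognize again with the adapted model.
    second = recognize(frames, adapted)
    return first, second, adapted
```

As in the abstract, the adaptation step is unsupervised: it relies on the first-pass output rather than on reference transcriptions, so its benefit depends on the first pass being reasonably accurate.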