This paper presents a machine learning scheme for dynamic time warping (DTW)-based speech recognition. Two categories of learning strategies, supervised and unsupervised, were developed for DTW. Two supervised learning methods, incremental learning and priority-rejection learning, are proposed. The incremental learning method is conceptually simple but still requires matching the testing template against a large database of keywords. The priority-rejection learning method effectively reduces the matching time at the cost of a slight decrease in recognition accuracy. In the unsupervised category, an automatic learning approach called "most-matching learning," which is based on priority-rejection learning, was developed. Most-matching learning can be used to intelligently choose the appropriate utterances for system learning. The effectiveness and efficiency of all three proposed machine-learning approaches for DTW were demonstrated using keyword speech recognition experiments.
In the past, the kernel of automatic speech recognition (ASR) was dynamic time warping (DTW), a feature-based template-matching technique that belongs to the family of dynamic programming (DP) methods. Although DTW is an early ASR technique, it remains popular in many applications; for example, DTW now plays an important role in the well-known Kinect-based gesture recognition application. This paper proposes an intelligent speech recognition system using an improved DTW approach for multimedia and home automation services. The improved DTW presented in this work, called HMM-like DTW, is essentially a hidden Markov model (HMM)-like method in which the concept of the typical HMM statistical model is brought into the design of DTW. By transforming feature-based DTW recognition into model-based DTW recognition, the developed HMM-like DTW method can behave like the HMM recognition technique; the proposed HMM-like DTW, with its HMM-like recognition model, is therefore able to further perform model adaptation (also known as speaker adaptation). A series of experimental results in home automation-based multimedia access service environments demonstrated the superiority and effectiveness of the smart speech recognition system developed with HMM-like DTW.
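For readers unfamiliar with the technique, classic feature-based DTW template matching can be sketched as follows. This is a minimal illustration of textbook DTW only, not the HMM-like DTW or the learning methods described in these abstracts. The function names `dtw_distance` and `recognize`, the Euclidean local distance, and the toy one-dimensional "features" are all assumptions made for the example.

```python
import math

def dtw_distance(a, b):
    """Accumulated DTW distance between two sequences of feature vectors.

    Assumption: the local distance is Euclidean and the allowed steps are
    the three standard ones (insertion, deletion, match).
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimum accumulated distance aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # local frame distance
            cost[i][j] = d + min(cost[i - 1][j],      # a advances alone
                                 cost[i][j - 1],      # b advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]

def recognize(test, templates):
    """Return the keyword whose reference template is nearest to `test`."""
    return min(templates, key=lambda word: dtw_distance(test, templates[word]))

# Toy data (hypothetical): one-dimensional feature frames per keyword.
templates = {
    "yes": [(0.0,), (1.0,), (2.0,)],
    "no": [(5.0,), (5.0,), (5.0,)],
}
test = [(0.0,), (0.0,), (1.0,), (2.0,)]  # a time-stretched "yes"
print(recognize(test, templates))
```

Because every test utterance is compared against every stored template, the matching cost grows with the size of the keyword database, which is the cost that template-pruning strategies aim to reduce.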
A long-term research project toward Mandarin speech recognition techniques for very large vocabulary and unlimited text is considered. Based on a careful examination of the special structures of the Chinese language, the first-stage goal is set to be the design of efficient techniques to recognize the finals of Mandarin syllables. In this paper, three special approaches for doing this are proposed. The Segmental Model Approach defines the final models by dividing the finals into several segments according to the acoustic structures of the speech signals. The Three-pass Approach uses three consecutive passes to classify the finals into small sets and improve the recognition efficiency. The Multi-section Vector Quantization (MSVQ) Approach, on the other hand, significantly reduces the necessary computation time by incorporating the branch-and-bound algorithm and the common codebook concept into the MSVQ techniques. Extensive computer simulations are performed, first to optimize each approach by choosing the best set of parameters and then to compare the performance of the three approaches. It was found that all three approaches are very efficient, with relatively high recognition rates and short computation times, and that the MSVQ Approach provides the highest recognition rate at the shortest computation time, making it the most attractive.