This paper considers a long-term research project on Mandarin speech recognition for very large vocabularies and unlimited text. Based on a careful examination of the special structure of the Chinese language, the first-stage goal is set to be the design of efficient techniques for recognizing the finals of Mandarin syllables. Three approaches to this task are proposed. The Segmental Model Approach defines the final models by dividing each final into several segments according to the acoustic structure of the speech signal. The Three-pass Approach uses three consecutive passes to classify the finals into small sets and thereby improve recognition efficiency. The Multi-section Vector Quantization (MSVQ) Approach, on the other hand, significantly reduces the necessary computation time by combining the branch-and-bound algorithm and a common codebook concept with MSVQ techniques. Extensive computer simulations were performed, first to optimize each approach by choosing its best set of parameters, and then to compare the performance of the three approaches. All three approaches were found to be efficient, achieving relatively high recognition rates with short computation times. The MSVQ Approach provides the highest recognition rate at the shortest computation time and is therefore the most attractive.
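As an illustration of the general idea behind multi-section vector quantization, the following minimal sketch classifies a sequence of feature frames by splitting it into equal sections and scoring each section against its own codebook. This is not the authors' implementation: the equal-length sectioning, the squared Euclidean distance, the function names, and the toy two-dimensional data are all assumptions made for illustration, and the branch-and-bound search and common codebook mentioned in the abstract are not attempted here.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]
Codebook = List[Vector]


def split_into_sections(frames: List[Vector], n_sections: int) -> List[List[Vector]]:
    """Split a frame sequence into n_sections contiguous, nearly equal parts."""
    n = len(frames)
    bounds = [i * n // n_sections for i in range(n_sections + 1)]
    return [frames[bounds[i]:bounds[i + 1]] for i in range(n_sections)]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance (an assumed distortion measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def msvq_distortion(frames: List[Vector], section_codebooks: List[Codebook]) -> float:
    """Average distortion when each section is quantized with its own codebook."""
    sections = split_into_sections(frames, len(section_codebooks))
    total = 0.0
    for section, codebook in zip(sections, section_codebooks):
        for frame in section:
            # Nearest codeword within this section's codebook only.
            total += min(sq_dist(frame, codeword) for codeword in codebook)
    return total / len(frames)


def classify(frames: List[Vector], models: Dict[str, List[Codebook]]) -> str:
    """Return the label whose sectioned codebooks give the lowest distortion."""
    return min(models, key=lambda label: msvq_distortion(frames, models[label]))


# Hypothetical toy models: two classes with the same codewords in opposite order.
models = {
    "a": [[(0.0, 0.0)], [(1.0, 1.0)]],
    "b": [[(1.0, 1.0)], [(0.0, 0.0)]],
}
frames = [(0.1, 0.0), (0.0, 0.1), (0.9, 1.0), (1.0, 0.9)]
best = classify(frames, models)
```

In this toy case the two models contain identical codewords and differ only in which section holds them, so a single pooled codebook could not separate them; the per-section codebooks retain the temporal order of the signal, which is the property that makes sectioning useful for speech.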