Yen-Ming Hsu scite author profile

Yen-Ming Hsu

4 Publications

21 Citation Statements Received

45 Citation Statements Given

Affiliations

National Formosa University

Publications

Developments of Machine Learning Schemes for Dynamic Time-Wrapping-Based Speech Recognition

Ding

Yen

Hsu

2013

Mathematical Problems in Engineering

This paper presents a machine learning scheme for dynamic time-wrapping-based (DTW) speech recognition. Two categories of learning strategies, supervised and unsupervised, were developed for DTW. Two supervised learning methods, incremental learning and priority-rejection learning, were proposed in this study. The incremental learning method is conceptually simple but still suffers from a large database of keywords for matching the testing template. The priority-rejection learning method can effectively reduce the matching time with a slight decrease in recognition accuracy. Regarding the unsupervised learning category, an automatic learning approach, called "most-matching learning," which is based on priority-rejection learning, was developed in this study. Most-matching learning can be used to intelligently choose the appropriate utterances for system learning. The effectiveness and efficiency of all three proposed machine-learning approaches for DTW were demonstrated using keyword speech recognition experiments.

An HMM-Like Dynamic Time Warping Scheme for Automatic Speech Recognition

Ding

Hsu

2014

Mathematical Problems in Engineering

Historically, the kernel of automatic speech recognition (ASR) was dynamic time warping (DTW), a feature-based template-matching technique that belongs to the dynamic programming (DP) family. Although DTW is an early ASR technique, it remains popular in many applications and now plays an important role in Kinect-based gesture recognition. This paper proposes an intelligent speech recognition system using an improved DTW approach for multimedia and home automation services. The improved DTW presented in this work, called HMM-like DTW, is essentially a hidden Markov model (HMM)-like method in which the concept of the typical HMM statistical model is brought into the design of DTW. By transforming feature-based DTW recognition into model-based DTW recognition, the HMM-like DTW method can behave like the HMM recognition technique, and its HMM-like recognition model can therefore further perform model adaptation (also known as speaker adaptation). A series of experimental results in home automation-based multimedia access service environments demonstrated the superiority and effectiveness of the developed smart speech recognition system based on HMM-like DTW.
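For readers unfamiliar with the baseline technique this abstract builds on, the following is a minimal sketch of classic DTW template matching by dynamic programming. It is not the HMM-like DTW proposed in the paper. The function names `dtw_distance` and `recognize`, the use of scalar features, and the absolute-difference local cost are all illustrative assumptions; a practical recognizer would typically compare frame-level feature vectors instead.

```python
import math


def dtw_distance(a, b):
    """Accumulated DTW cost between two sequences of numbers.

    Assumption for illustration: each element is a single scalar feature
    and the local cost is the absolute difference.
    """
    n, m = len(a), len(b)
    # cost[i][j] is the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b[j-1] is reused
                cost[i][j - 1],      # b advances, a[i-1] is reused
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def recognize(test, templates):
    """Return the label of the stored template with the smallest DTW cost."""
    return min(templates, key=lambda label: dtw_distance(test, templates[label]))
```

Under these assumptions, recognition reduces to computing the DTW cost between a test utterance and every stored keyword template and picking the cheapest match, which is why matching time grows with the size of the template database.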

Efficient Speech Recognition Techniques for the Finals of Mandarin Syllables

Hwang

Hsu

Wang

et al. 1988

Int. J. Patt. Recogn. Artif. Intell.

A long-term research project toward Mandarin speech recognition techniques for very large vocabulary and unlimited text is considered. By carefully examining the special structures of Chinese language, the first-stage goal is set to be the design of efficient techniques to recognize the finals of Mandarin syllables. In this paper, three special approaches to do this are proposed. The Segmental Model Approach defines the final models by dividing the finals into several segments according to the acoustic structures of the speech signals. The Three-pass Approach uses three consecutive passes to classify the finals into small sets and improve the recognition efficiency. The Multi-section Vector Quantization (MSVQ) Approach, on the other hand, significantly reduces the necessary computation time by incorporating the branch-and-bound algorithm and common codebook concept with the MSVQ techniques. Extensive computer simulations are performed first to optimize each approach by choosing the best set of parameters then to compare the performance of the three approaches. It was found that all the three approaches are very efficient in terms of relatively high recognition rate and short computation time, and the MSVQ Approach provides the highest recognition rate at the shortest computation time, thus it is most attractive.

A machine learning approach to dynamic programming for stochastic process of speech recognition

Ding

Yen

Hsu

2013
