This paper discusses a hidden Markov model (HMM) based on multi-space probability distribution (MSD). HMMs are widely used statistical models for characterizing sequences of speech spectra and have been applied successfully to speech recognition systems. This suggests that the HMM is also useful for modeling pitch patterns of speech. However, conventional discrete or continuous HMMs cannot be applied to pitch pattern modeling, because the observation sequence of a pitch pattern is composed of one-dimensional continuous values and a discrete symbol that represents "unvoiced". The MSD-HMM includes the discrete HMM and the continuous mixture HMM as special cases, and can further model sequences of observation vectors with variable dimension, including zero-dimensional observations, i.e., discrete symbols. As a result, MSD-HMMs can model pitch patterns without heuristic assumptions. We derive a reestimation algorithm for the extended HMM and show that it can find a critical point of the likelihood function.
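To make the idea of a zero-dimensional observation concrete, the following is a minimal sketch of a state output probability for pitch under a two-space MSD. It is an illustration under stated assumptions, not the paper's implementation: it assumes one 1-D Gaussian space for voiced frames and one zero-dimensional space for unvoiced frames, with the two space weights summing to one. The names `gaussian_pdf`, `msd_output_prob`, and `w_voiced` are hypothetical.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def msd_output_prob(obs, w_voiced, mean, var):
    """Output probability of one pitch observation under a two-space MSD.

    Assumptions (illustrative only):
      - obs is a float (e.g. log F0) for a voiced frame, or None for "unvoiced".
      - The voiced space is a single 1-D Gaussian with weight w_voiced.
      - The unvoiced space is zero-dimensional with weight 1 - w_voiced;
        its "density" is taken to be 1, so only the weight remains.
    """
    if obs is None:
        return 1.0 - w_voiced
    return w_voiced * gaussian_pdf(obs, mean, var)


if __name__ == "__main__":
    # Hypothetical state parameters, chosen only for demonstration.
    print(msd_output_prob(5.0, w_voiced=0.7, mean=5.0, var=0.04))
    print(msd_output_prob(None, w_voiced=0.7, mean=5.0, var=0.04))
```

Setting `w_voiced` to 1 reduces this sketch to an ordinary continuous density, while dropping the Gaussian space leaves a purely discrete distribution, which mirrors the special cases mentioned in the abstract.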
This paper proposes a method for incrementally understanding user utterances whose semantic boundaries are not known and responding in real time even before boundaries are determined. It is an integrated parsing and discourse processing method that updates the partial result of understanding word by word, enabling responses based on the partial result. This method incrementally finds plausible sequences of utterances that play crucial roles in the task execution of dialogues, and utilizes beam search to deal with the ambiguity of boundaries as well as syntactic and semantic ambiguities. The results of a preliminary experiment demonstrate that this method understands user utterances better than an understanding method that assumes pauses to be semantic boundaries.
SUMMARY A scheme for simultaneously modeling and generating a pitch pattern and a spectral sequence on the basis of a hidden Markov model (HMM) is presented. Since a pitch pattern is expressed as a time series of voiced intervals taking continuous values and voiceless intervals without values, it cannot be modeled by the usual HMM. This paper proposes a scheme for modeling pitch and spectrum jointly, using feature vectors that combine pitch parameters and spectral parameters and applying an HMM based on a multispace probability distribution (MSD-HMM). In addition, a context clustering scheme based on decision trees in the MSD-HMM is derived, and a scheme for constructing the model while taking account of the variation factors of the pitch and the spectrum is presented. Finally, it is shown that pitch patterns and spectral sequences approximating real speech can be generated by using the parameter generation scheme based on the maximum likelihood criterion.
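One way to picture a frame that combines spectral and pitch parameters is sketched below. This is an assumption-laden illustration rather than the scheme from the paper: it assumes the spectral part is scored with a single diagonal-covariance Gaussian, the pitch part with a two-space MSD (a 1-D Gaussian for voiced frames, a zero-dimensional space for unvoiced frames), and that the two parts are treated as independent given the state, so their log-probabilities add. All function names and dictionary keys are hypothetical.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def log_msd_pitch(f0, w_voiced, mean, var):
    """Log-probability of a pitch value; f0 is None for an unvoiced frame."""
    if f0 is None:
        # Zero-dimensional space: only the space weight contributes.
        return math.log(1.0 - w_voiced)
    return math.log(w_voiced) + log_gauss_diag([f0], [mean], [var])


def state_log_prob(frame, state):
    """Joint log-probability of a (spectrum, f0) frame in one HMM state.

    Assumes the spectral and pitch parts are independent given the state.
    """
    spectrum, f0 = frame
    spec_lp = log_gauss_diag(spectrum, state["spec_mean"], state["spec_var"])
    pitch_lp = log_msd_pitch(f0, state["w_voiced"], state["f0_mean"], state["f0_var"])
    return spec_lp + pitch_lp


if __name__ == "__main__":
    # Hypothetical state parameters, chosen only for demonstration.
    state = {
        "spec_mean": [0.0, 0.0],
        "spec_var": [1.0, 1.0],
        "w_voiced": 0.8,
        "f0_mean": 5.0,
        "f0_var": 0.04,
    }
    print(state_log_prob(([0.1, -0.2], 5.1), state))   # voiced frame
    print(state_log_prob(([0.1, -0.2], None), state))  # unvoiced frame
```

Both voiced and unvoiced frames receive a well-defined score from the same state, so no placeholder pitch value has to be invented for unvoiced intervals.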