This paper presents an accurate and efficient voice retrieval system for very-large-vocabulary Chinese textual databases, built around a specially designed clustered language model. To reduce the problems resulting from the complexity of unconstrained speech-input queries, the system is completely syllable-based in both speech recognition and database retrieval, exploiting the monosyllabic structure of the Chinese language. In addition, it partitions the records in the database into clusters and trains the clustered language model on the clustering results. The proposed clustered language model, together with its augmented search algorithm, improves both the accuracy and the speed of the speech retrieval system. In preliminary tests using an experimental database of about 30,000 bibliographical records, the system accepted unconstrained speech-input queries and achieved very good performance.
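The abstract gives no implementation details, so the following is only a minimal illustrative sketch of the general idea of cluster-specific syllable language models, not the authors' system. It assumes records are already transcribed into syllables and already assigned to clusters. The toy data, the add-one smoothed bigram model, the bigram-overlap ranking, and every name (`CLUSTERS`, `train_cluster_lm`, `log_prob`, `retrieve`) are hypothetical choices made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy database: each record is a list of toneless Pinyin
# syllables, already grouped into clusters (the clustering step is not shown).
CLUSTERS = {
    "speech": [
        ["yu", "yin", "bian", "shi"],
        ["yu", "yin", "he", "cheng"],
    ],
    "network": [
        ["wang", "lu", "an", "quan"],
        ["wang", "lu", "xie", "ding"],
    ],
}


def bigrams(syllables):
    """Return the adjacent syllable pairs of a syllable sequence."""
    return list(zip(syllables, syllables[1:]))


def train_cluster_lm(records):
    """Count syllable bigrams and their left contexts within one cluster."""
    pair_counts, context_counts = Counter(), Counter()
    for record in records:
        for a, b in bigrams(record):
            pair_counts[(a, b)] += 1
            context_counts[a] += 1
    return pair_counts, context_counts


def log_prob(query, lm, vocab_size):
    """Add-one smoothed bigram log-probability of a query (an assumed model)."""
    pair_counts, context_counts = lm
    return sum(
        math.log((pair_counts[(a, b)] + 1) / (context_counts[a] + vocab_size))
        for a, b in bigrams(query)
    )


def retrieve(query):
    """Pick the best-scoring cluster, then the record sharing most bigrams."""
    vocab = {s for recs in CLUSTERS.values() for rec in recs for s in rec}
    lms = {name: train_cluster_lm(recs) for name, recs in CLUSTERS.items()}
    best = max(lms, key=lambda name: log_prob(query, lms[name], len(vocab)))
    query_pairs = set(bigrams(query))
    ranked = sorted(
        CLUSTERS[best],
        key=lambda rec: len(set(bigrams(rec)) & query_pairs),
        reverse=True,
    )
    return best, ranked[0]
```

In this sketch, scoring the query against each cluster's model first narrows the search to one cluster, which is one plausible way a clustered model could help both accuracy and speed. A real system would operate on recognizer output, such as syllable lattices, rather than on clean syllable strings.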
Golden Mandarin (II) is an intelligent, single-chip-based, real-time Mandarin dictation machine for the Chinese language with a very large vocabulary, designed for the input of unlimited Chinese texts into computers using voice. This dictation machine can be installed on any personal computer, and uses only a single Motorola DSP 96002D chip, with a preliminary character correct rate of around 95% at a speed of 0.6 sec per character. Various adaptation/learning functions have been developed for this machine, including fast adaptation to new speakers and on-line learning of the voice characteristics, task domains, word patterns, and noise environments of the users, so the machine can be easily personalized for each user. These adaptation/learning functions will be the major subjects of this paper.
1. Dept. of Electrical Engineering, National Taiwan University. 2. Dept. of Computer Science and Information Engineering,