SUMMARY
A new consonant recognition method is proposed which integrates two stochastic methods: discriminant analysis and HMM. Discriminant analysis is effective in analyzing local patterns but it assumes precise detection of reference points. HMM is able to extract the overall dynamic features and needs no explicit segmentation of speech. However, it lacks the ability to discriminate between similar consonants. The method herein constructs an HMM with a discriminant analysis front end and recognizes consonants by combining the score obtained by discriminant analysis and that by HMM. For all the Japanese consonants, this integrated method achieved a recognition rate of 92.1 percent, which is higher by 5 to 15 percent than the case using either of the two methods alone.
Introduction
To realize large-vocabulary speaker-independent speech recognition, phoneme-based recognition is desirable. Therefore, we are studying recognition of consonants which, due to their dynamic features, is more difficult than that of vowels.
There are many approaches to consonant recognition: the rule-based method [1], DP matching [2], the statistical method [3], Markov modeling [4], and neural networks [5].
Among them, the statistical or probabilistic method is advantageous because it can avoid extracting explicit distinctive features and realizes a simple and flexible interface with the natural language processing unit.
Discriminant analysis, which is one of the multivariate statistical analyses, is suitable for discriminating local patterns around the reference point of a consonant, such as a burst point or a starting point of friction. This method assumes the exact detection of the reference point of consonants, and such a precise segmentation of speech is extremely difficult.
On the other hand, HMM (Hidden Markov Model) is able to extract global dynamic features of a consonant from the preceding vowel to the following vowel and it does not require precise segmentation. However, with conventional HMM, it is difficult to discriminate acoustically similar consonants because quantizing input pattern vectors causes loss of discriminant information, and the learning algorithms d...