ABSTRACT

This paper describes an attempt to design a knowledge-based large vocabulary speech recognition system. Our motivation is to replace features based on short-term spectra, such as Mel-frequency cepstral coefficients (MFCC), by features that explicitly represent some of the distinctive features of the speech signal. However, rather than attempting to compute acoustic correlates of these distinctive features, we have engineered an approach where neural networks are trained to map short-term spectral features to the posterior probability of some distinctive features. These probabilities are then used as features in a large vocabulary tied-state HMM-based recognizer. Experimental results on the Wall Street Journal task show that such a system, while not outperforming an MFCC-based system, generates very different error patterns. After combining the results of a baseline MFCC system with the results of several systems based on the proposed approach, we were able to obtain reductions in word error rate of 19% and 10% on the 5K and 20K tasks respectively over our best MFCC-based systems.