An interaction has been found between the true source language model, the training language model, and the testing language model. This interaction has implications for vocabulary-independent modeling, testing methodologies, discriminative training, and the adequacy of our current databases for continuous speech recognition (CSR) development. The current DARPA databases suffer from the described difficulties, which suggests that new CSR databases are needed if we are to further advance the state of the art.

The Interaction During Training

When a category model (e.g. a context-free (CF) model such as a monophone) is used to model a set of subcategories (e.g. context-dependent (CD) models such as triphones), the category model becomes the subcategory prior-probability-weighted average of the subcategory models:

$$M_{cat} = \sum_{subcat} P_{subcat} \, M_{subcat}$$

where $M$ denotes a model. (The mathematics used here is intended to be conceptual rather than rigorous. Thus models will be considered to be averages. In practice, the method for deriving a model from a set of sub-models or observations is highly dependent upon the form of the model used.) In a field such as speech recognition, where models are trained from exemplars, the subcategory model will generally be:

$$M_{subcat} = \frac{1}{N} \sum_{i=1}^{N} O_{subcat,i}$$

where $O_{subcat,i}$ is an observation emitted from the subcategory. $M_{cat}$ combines both the subcategory models and the prior probabilities of the subcategories, and similarly $M_{subcat}$ combines the observations and their (sampled) prior probabilities.
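The two averages above can be illustrated with a small numeric sketch. This is a hypothetical example, not the paper's implementation: it treats each "model" simply as a mean feature vector, uses observation counts as the sampled prior probabilities, and shows that the prior-weighted average of subcategory models equals the model trained by pooling all observations into the category.

```python
import numpy as np

rng = np.random.default_rng(0)

# Observations emitted by two subcategories of one category
# (e.g. two triphones belonging to the same monophone).
obs_a = rng.normal(loc=1.0, scale=0.1, size=(30, 2))   # 30 observations, 2-dim features
obs_b = rng.normal(loc=-1.0, scale=0.1, size=(10, 2))  # 10 observations

# M_subcat = (1/N) * sum_i O_subcat,i : each subcategory model is the
# average of the observations emitted from that subcategory.
m_a = obs_a.mean(axis=0)
m_b = obs_b.mean(axis=0)

# Sampled prior probabilities P_subcat, taken from the observation counts.
n_total = len(obs_a) + len(obs_b)
p_a = len(obs_a) / n_total   # 0.75
p_b = len(obs_b) / n_total   # 0.25

# M_cat = sum_subcat P_subcat * M_subcat : the category model is the
# prior-probability-weighted average of the subcategory models.
m_cat = p_a * m_a + p_b * m_b

# Equivalently, training the category model directly on the pooled
# observations yields the same result.
m_cat_pooled = np.concatenate([obs_a, obs_b]).mean(axis=0)
assert np.allclose(m_cat, m_cat_pooled)
```

The final assertion makes the conceptual point concrete: the category model implicitly absorbs the subcategory priors as they occur in the training data, which is the mechanism behind the training-time interaction described here.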