Due to their advantages over conventional n-gram language models, recurrent neural network language models (RNNLMs) have recently attracted a fair amount of research attention in the speech recognition community. In this paper, we explore one advantage of RNNLMs, namely, the ease with which they allow the integration of additional knowledge sources. We concentrate on features that provide complementary information with respect to the lexical identities of the words. We refer to such information as meta-information. We single out three cases and investigate their merits by means of N-best list re-scoring experiments on a challenging corpus of spoken Dutch (referred to as CGN) as well as on the English Wall Street Journal (WSJ) corpus. First, we look at part-of-speech (POS) tags and lemmas, two sources of word-level linguistic information that are known to contribute to the performance of conventional language models. We confirm that RNNLMs can benefit from these sources as well. Second, we investigate socio-situational settings (SSSs) and topics, two sources of discourse-level information that are also known to benefit language models. SSSs are present in the CGN data and can be seen as a proxy for the language register. For the purposes of our investigation, we assume that information on the SSS can be captured at the moment at which speech is recorded. Topics, i.e., treatments of different subjects, are present in the WSJ data. To predict POS tags, lemmas, SSSs, and topics, a second RNNLM is coupled to the main RNNLM. We refer to this architecture as a recurrent neural network tandem language model (RNNTLM). Our experimental findings show that if high-quality meta-information labels are available, both word-level and discourse-level information improve the performance of language models. Third, we investigate sentence length and word length (i.e., token size), two sources of intrinsic information that are readily available for exploitation because they are known at the time of re-scoring. Intrinsic information has been largely overlooked in language modeling research. The results of experiments on both the CGN and WSJ data show that integrating sentence length and word length improves performance, and RNNLMs allow these features to be incorporated with ease.