“…Therefore, the system will be better prepared for working in noisy environments, since it is able to cope not only with spelling errors, but also with out-of-vocabulary words and spelling, morphological or even historical variants (Lee and Ahn, 1996; Mustafa and Al-Radaideh, 2004), in contrast with classical conflation techniques based on stemming, lemmatization or morphological analysis, which are negatively affected by these phenomena. This feature is extremely valuable, not only for regular text retrieval tasks, but also for specialized tasks such as spoken document retrieval (SDR) (Ng et al., 2000), or cross-lingual information retrieval (CLIR) over closely-related languages using no translation, but only cognate matching (McNamee and Mayfield, 2004a). The third major factor for the success of n-grams in IR applications comes from their inherent language-independent nature.…”
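The robustness claim in this excerpt can be illustrated with a minimal sketch, assuming overlapping character trigrams with boundary padding and a Dice coefficient as the similarity measure. The function names `char_ngrams` and `dice`, the padding character and the choice of n = 3 are illustrative assumptions, not details taken from the cited work.

```python
def char_ngrams(word, n=3, pad="_"):
    """Return the set of overlapping character n-grams of a word.

    The word is lowercased and padded with n-1 boundary symbols on each
    side so that prefixes and suffixes also produce n-grams. The padding
    scheme is an assumption made for this sketch.
    """
    padded = pad * (n - 1) + word.lower() + pad * (n - 1)
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def dice(a, b, n=3):
    """Dice coefficient between the character n-gram sets of two words."""
    x, y = char_ngrams(a, n), char_ngrams(b, n)
    if not x and not y:
        # Only reachable for n == 1 with two empty strings.
        return 1.0
    return 2 * len(x & y) / (len(x) + len(y))


if __name__ == "__main__":
    # A misspelling shares most of its trigrams with the intended word,
    # whereas exact whole-word matching would treat the two as unrelated.
    print(dice("retrieval", "retreival"))
    # Unrelated words share no trigrams.
    print(dice("potato", "engine"))
```

Because the comparison operates on substrings rather than on whole words or dictionary forms, a misspelled or variant form still overlaps with the intended term, and no language-specific stemmer or lemmatizer is required.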