“…Statistical significance of over-representation of these word patterns provides valuable clues to biologists. Consequently, much work has been done on the use of asymptotic limiting distributions to approximate these p-values (Prum et al., 1995; Reinert et al., 2000; Régnier, 2000; Robin et al., 2002; Huang et al., 2004; Leung et al., 2005; Mitrophanov and Borodovsky, 2006; Pape et al., 2008). However, the approximations may not be accurate for short words or for words consisting of repeats, and most theoretical approximations work only in specific settings.…”