Deep sequencing of transposon mutant libraries (or TnSeq) is a powerful method for probing essentiality of genomic loci under different environmental conditions. Various analytical methods have been described for identifying conditionally essential genes whose tolerance for insertions varies between two conditions. However, for large-scale experiments involving many conditions, it would be useful to have a method for identifying genes that exhibit significant variability in insertions across multiple conditions. In this paper, we introduce a novel statistical method for identifying genes with significant variability of insertion counts across multiple conditions based on Zero-Inflated Negative Binomial (ZINB) regression. Using likelihood ratio tests, we show that the ZINB fits TnSeq data better than either ANOVA or a Negative Binomial (as a generalized linear model). We use ZINB regression to identify genes required for infection of M. tuberculosis H37Rv in C57BL/6 mice. We also use ZINB to perform a retrospective analysis of genes conditionally essential in H37Rv cultures exposed to multiple antibiotics. Our results show that, not only does ZINB generally identify most of the genes found by pairwise resampling (and vastly outperform ANOVA), but it also identifies additional genes where variability is detectable only when the magnitudes of insertion counts are treated separately from local differences in saturation, as in the ZINB model.

1 Introduction

Deep sequencing of transposon mutant libraries (or TnSeq) is a powerful method for probing essentiality of genomic loci under different environmental conditions [1]. In a transposon (Tn) mutant library (such as one made with the Himar1 transposon), insertions generally occur at random locations throughout the genome (restricted to TA dinucleotides for Himar1 [2]).
The absence (or reduction) of insertions in a locus is used to infer conditional essentiality, based on killing (or growth impairment) that depletes those clones from the population. While the abundance of clones with insertions at different sites can be profiled efficiently through deep sequencing, there are a number of sources of noise that induce a high degree of variability in insertion counts at each site, including: variations in mutant abundance during library construction, stochastic differences among samples, biases due to sample preparation protocol and sequencing technology, and other effects. Previous statistical methods have been developed for quantitative assessment of essential genes in single conditions, as well as pairwise comparisons of conditional essentiality. Statistical methods for characterizing essential regions in a genome include those based on tests of sums of insertion counts in genes [3], gaps [4], bimodality of empirical distributions [5], non-parametric tests of counts [6], Poisson distributions [7], and Hidden Markov Models [8]. Statistical methods for evaluating conditional essen...