Abstract. This paper proposes a new method for Bayesian multi-topic analysis of microarray data. Our method further maximizes the lower bound in marginalized variational Bayesian inference (MVB) for Latent Process Decomposition (LPD), an effective probabilistic model for microarray data. In our method, the hyperparameters of LPD are updated by empirical Bayes point estimation. Experiments on microarray data of realistically large size show the efficiency of our hyperparameter reestimation technique.

We can also apply a Bayesian multi-topic analysis in the style of latent Dirichlet allocation (LDA) to microarray data by regarding samples as documents and genes as words. However, microarray data are given as a real-valued matrix, not as a matrix of non-negative integers. Researchers have therefore adapted LDA by replacing the word multinomial distributions with Gaussian distributions, yielding an efficient probabilistic model, Latent Process Decomposition (LPD) [9], in which the topics of LDA are called processes. Just as LDA places Dirichlet priors on the word multinomials, LPD places priors on the Gaussian distributions: Gaussian priors on the mean parameters and Gamma priors on the precision parameters. However, as far as we know, no work has reported how to reestimate the hyperparameters, i.e., the parameters of these prior distributions, or whether hyperparameter reestimation improves microarray analysis. Therefore, in this paper, we provide a hyperparameter reestimation technique for LPD and report experiments on microarray data of realistically large size.
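To make the correspondence with LDA concrete, the following is a minimal sketch of an LPD-style generative process, not the implementation used in this paper. It assumes that each sample draws its process proportions from a symmetric Dirichlet distribution, as in LDA, and that each gene has one Gaussian per process. The function name `sample_lpd` and the hyperparameter names `alpha`, `mu0`, `sigma0`, `a0`, and `b0` are hypothetical and chosen only for illustration.

```python
import random


def sample_lpd(n_samples, n_genes, n_processes,
               alpha=1.0, mu0=0.0, sigma0=1.0, a0=2.0, b0=1.0, seed=0):
    """Draw synthetic expression data from an LPD-style generative model.

    All hyperparameter names are illustrative assumptions:
      alpha       symmetric Dirichlet parameter for process proportions
      mu0, sigma0 mean and standard deviation of the Gaussian prior on means
      a0, b0      shape and rate of the Gamma prior on precisions
    """
    rng = random.Random(seed)

    # One Gaussian per (gene, process) pair, with parameters drawn from priors.
    mu = [[rng.gauss(mu0, sigma0) for _ in range(n_processes)]
          for _ in range(n_genes)]
    # random.gammavariate takes (shape, scale), so the rate b0 is inverted.
    precision = [[rng.gammavariate(a0, 1.0 / b0) for _ in range(n_processes)]
                 for _ in range(n_genes)]

    data, assignments, proportions = [], [], []
    for _ in range(n_samples):
        # Dirichlet draw via normalized Gamma variates (samples play the role of documents).
        raw = [rng.gammavariate(alpha, 1.0) for _ in range(n_processes)]
        total = sum(raw)
        theta = [v / total for v in raw]

        # Each gene (the analogue of a word) picks a process, then a real value.
        z = rng.choices(range(n_processes), weights=theta, k=n_genes)
        x = [rng.gauss(mu[gene][k], precision[gene][k] ** -0.5)
             for gene, k in enumerate(z)]

        data.append(x)
        assignments.append(z)
        proportions.append(theta)
    return data, assignments, proportions


if __name__ == "__main__":
    data, z, theta = sample_lpd(n_samples=3, n_genes=5, n_processes=2)
    for row in data:
        print(["%.2f" % v for v in row])
```

The only structural change from LDA in this sketch is the emission step: a real-valued Gaussian draw replaces the draw of a word from a multinomial. The hyperparameters passed as arguments here are the quantities that reestimation would update.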
Introduction

Our method is based on the marginalized variational Bayesian inference (MVB) proposed by Ying et al. [15]. MVB, also called collapsed variational Bayesian inference [12], theoretically achieves better lower bounds than conventional variational Bayesian inferences