This document presents methods to remove the initialization or burn-in bias from Markov chain Monte Carlo (MCMC) estimates, with consequences on parallel computing, convergence diagnostics and performance assessment. The document is written as an introduction to these methods for MCMC users. Some theoretical results are mentioned, but the focus is on the methodology.
Iterated filtering algorithms are stochastic optimization procedures for latent variable models that recursively combine parameter perturbations with latent variable reconstruction. Previously, theoretical support for these algorithms has been based on the use of conditional moments of perturbed parameters to approximate derivatives of the log likelihood function. Here, a theoretical approach is introduced based on the convergence of an iterated Bayes map. An algorithm supported by this theory displays substantial numerical improvement on the computational challenge of inferring parameters of a partially observed Markov process.

sequential Monte Carlo | particle filter | maximum likelihood | Markov process

An iterated filtering algorithm was originally proposed for maximum likelihood inference on partially observed Markov process (POMP) models by Ionides et al. (1). Variations on the original algorithm have been proposed to extend it to general latent variable models (2) and to improve numerical performance (3, 4). In this paper, we study an iterated filtering algorithm that generalizes the data cloning method (5, 6) and is therefore also related to other Monte Carlo methods for likelihood-based inference (7-9). Data cloning methodology is based on the observation that iterating a Bayes map converges to a point mass at the maximum likelihood estimate. Combining such iterations with perturbations of model parameters improves the numerical stability of data cloning and provides a foundation for stable algorithms in which the Bayes map is numerically approximated by sequential Monte Carlo computations.

We investigate convergence of a sequential Monte Carlo implementation of an iterated filtering algorithm that combines data cloning, in the sense of Lele et al. (5), with the stochastic parameter perturbations used by the iterated filtering algorithm of (1). Lindström et al. (4) proposed a similar algorithm, termed fast iterated filtering, but the theoretical support for that algorithm involved unproved conjectures. We present convergence results for our algorithm, which we call IF2. Empirically, it can dramatically outperform the previous iterated filtering algorithm of ref. 1, which we refer to as IF1. Although IF1 and IF2 both involve recursively filtering through the data, the theoretical justification and practical implementations of these algorithms are fundamentally different. IF1 approximates the Fisher score function, whereas IF2 implements an iterated Bayes map. IF1 has been used in applications for which no other computationally feasible algorithm for statistically efficient, likelihood-based inference was known (10-15). The extra capabilities offered by IF2 open up further possibilities for drawing inferences about nonlinear partially observed stochastic dynamic models from time series data.

Iterated filtering algorithms implemented using basic sequential Monte Carlo techniques have the property that they do not need to evaluate the transition density of the latent Markov process. Algorithms with this property...
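The observation that iterating a Bayes map concentrates mass at the maximum likelihood estimate can be illustrated with a small sketch. This is not IF2 or data cloning as implemented by the authors: it is a toy grid computation with no parameter perturbations and no sequential Monte Carlo, and the Bernoulli model, the grid, and the function names `iterate_bayes_map` and `spread` are assumptions made purely for illustration.

```python
def iterate_bayes_map(likelihood, prior, n_iter):
    """Apply the Bayes map n_iter times on a fixed grid of parameter values.

    Each application multiplies the current density by the likelihood and
    renormalizes, so after m applications the density is proportional to
    prior * likelihood**m.
    """
    density = list(prior)
    for _ in range(n_iter):
        unnormalized = [d * l for d, l in zip(density, likelihood)]
        total = sum(unnormalized)
        density = [u / total for u in unnormalized]
    return density


def spread(grid, density):
    """Standard deviation of a discrete density on the grid."""
    mean = sum(t * d for t, d in zip(grid, density))
    var = sum((t - mean) ** 2 * d for t, d in zip(grid, density))
    return var ** 0.5


# Hypothetical toy model: 7 successes in 10 Bernoulli trials, so the MLE is 0.7.
successes, trials = 7, 10
grid = [i / 100 for i in range(1, 100)]
likelihood = [t ** successes * (1 - t) ** (trials - successes) for t in grid]
prior = [1 / len(grid)] * len(grid)  # uniform starting density

d1 = iterate_bayes_map(likelihood, prior, 1)    # ordinary posterior
d50 = iterate_bayes_map(likelihood, prior, 50)  # 50 iterations of the Bayes map

mode = grid[max(range(len(grid)), key=lambda i: d50[i])]
print("mode after 50 iterations:", mode)
print("spread after 1 iteration:", spread(grid, d1))
print("spread after 50 iterations:", spread(grid, d50))
```

In this sketch the mode stays at the maximizer of the likelihood while the spread shrinks as the number of iterations grows, which is the behaviour the text attributes to the iterated Bayes map.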
Inference for partially observed Markov process models has been a longstanding methodological challenge with many scientific and engineering applications. Iterated filtering algorithms maximize the likelihood function for partially observed Markov process models by solving a recursive sequence of filtering problems. We present new theoretical results pertaining to the convergence of iterated filtering algorithms implemented via sequential Monte Carlo filters. This theory complements the growing body of empirical evidence that iterated filtering algorithms provide an effective inference strategy for scientific models of nonlinear dynamic systems. The first step in our theory involves studying a new recursive approach for maximizing the likelihood function of a latent variable model, when this likelihood is evaluated via importance sampling. This leads to the consideration of an iterated importance sampling algorithm which serves as a simple special case of iterated filtering, and may have applicability in its own right. Comment: Published at http://dx.doi.org/10.1214/11-AOS886 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org
A large number of statistical models are "doubly-intractable": the likelihood normalising term, which is a function of the model parameters, is intractable, as well as the marginal likelihood (model evidence). This means that standard inference techniques to sample from the posterior, such as Markov chain Monte Carlo (MCMC), cannot be used. Examples include, but are not confined to, massive Gaussian Markov random fields, autologistic models and exponential random graph models. A number of approximate schemes based on MCMC techniques, approximate Bayesian computation (ABC) or analytic approximations to the posterior have been suggested, and these are reviewed here. Exact MCMC schemes, which can be applied to a subset of doubly-intractable distributions, have also been developed and are described in this paper. As yet, no general method exists which can be applied to all classes of models with doubly-intractable posteriors.

In addition, taking inspiration from the physics literature, we study an alternative method based on representing the intractable likelihood as an infinite series. Unbiased estimates of the likelihood can then be obtained by finite time stochastic truncation of the series via Russian Roulette sampling, although the estimates are not necessarily positive. Results from the quantum chromodynamics literature are exploited to allow the use of possibly negative estimates in a pseudo-marginal MCMC scheme such that expectations with respect to the posterior distribution are preserved. The methodology is reviewed on well-known examples such as the parameters in Ising models, the posterior for Fisher-Bingham distributions on the d-sphere and a large-scale Gaussian Markov random field model describing the Ozone Column data. This leads to a critical assessment of the strengths and weaknesses of the methodology with pointers to ongoing research.
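The idea of unbiased stochastic truncation can be sketched on a toy series. The example below is not the likelihood expansion used in the paper: it applies a generic Russian Roulette scheme with a constant continuation probability to the alternating geometric series with ratio -0.7, whose sum is 1/1.7. The function name `russian_roulette_estimate`, the series, and the continuation probability 0.6 are all assumptions chosen for illustration.

```python
import random


def russian_roulette_estimate(term, continue_prob, rng):
    """Return one unbiased estimate of sum_{k>=0} term(k) by random truncation.

    Term k is reached with probability continue_prob**k, so it is
    reweighted by continue_prob**(-k) to keep the expectation exact.
    """
    total = 0.0
    weight = 1.0  # inverse probability of reaching the current term
    k = 0
    while True:
        total += term(k) * weight
        if rng.random() >= continue_prob:
            return total  # the roulette stops the series here
        weight /= continue_prob
        k += 1


# Hypothetical toy series: sum_{k>=0} (-0.7)**k = 1 / 1.7.
ratio = -0.7
continue_prob = 0.6  # ratio**2 / continue_prob < 1, so the variance is finite

rng = random.Random(0)
estimates = [
    russian_roulette_estimate(lambda k: ratio ** k, continue_prob, rng)
    for _ in range(100_000)
]
mean_estimate = sum(estimates) / len(estimates)

print("true sum:", 1 / 1.7)
print("mean of estimates:", mean_estimate)
print("smallest single estimate:", min(estimates))
```

Individual estimates in this sketch can be negative even though the series sums to a positive number, which mirrors the remark in the text that the truncated estimates are not necessarily positive and motivates the sign correction used in the pseudo-marginal scheme.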