It is well known that Markov chain Monte Carlo (MCMC) methods scale poorly with dataset size. A popular class of methods for addressing this issue is stochastic gradient MCMC (SGMCMC). These methods use a noisy estimate of the gradient of the log-posterior, which reduces the per-iteration computational cost of the algorithm. Despite this, there are a number of results suggesting that stochastic gradient Langevin dynamics (SGLD), probably the most popular of these methods, still has computational cost proportional to the dataset size. We suggest an alternative log-posterior gradient estimate for stochastic gradient MCMC which uses control variates to reduce the variance. We analyse SGLD using this gradient estimate, and show that, under log-concavity assumptions on the target distribution, the computational cost required for a given level of accuracy is independent of the dataset size. Next, we show that a different control-variate technique, known as zero variance control variates, can be applied to SGMCMC algorithms for free. This postprocessing step improves the inference of the algorithm by reducing the variance of the MCMC output. Zero variance control variates rely on the gradient of the log-posterior; we explore how the variance reduction is affected by replacing this with the noisy gradient estimate calculated by SGMCMC.
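The control-variate idea described above can be sketched on a toy conjugate model. The sketch below is illustrative only: the model (Gaussian likelihood with Gaussian prior), the step size, and the batch size are all assumptions chosen for simplicity, and the control-variate centre is taken at the posterior mode, which here is available in closed form (in general it would be found by stochastic optimisation). The gradient estimate is the full-data gradient at the centre plus a rescaled minibatch sum of per-datum gradient differences, which is unbiased for the true gradient.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model (an assumption for illustration): x_i ~ N(theta, 1), prior theta ~ N(0, 10)
N = 1000
x = rng.normal(2.0, 1.0, size=N)

def grad_log_prior(theta):
    return -theta / 10.0

def grad_log_lik_i(theta, xi):
    return xi - theta  # per-datum gradient of log N(x_i; theta, 1)

# Control-variate centre: the posterior mode, closed-form for this conjugate model
theta_hat = np.sum(x) / (N + 0.1)
full_grad_hat = grad_log_prior(theta_hat) + np.sum(grad_log_lik_i(theta_hat, x))

def grad_estimate(theta, batch):
    # Unbiased estimate: g(theta_hat) + exact prior correction
    # + (N/n) * sum over the minibatch of per-datum gradient differences
    diff = grad_log_lik_i(theta, batch) - grad_log_lik_i(theta_hat, batch)
    return (full_grad_hat + grad_log_prior(theta) - grad_log_prior(theta_hat)
            + (N / len(batch)) * np.sum(diff))

# SGLD updates using the control-variate gradient estimate
eps, n_batch, n_iter = 1e-3, 10, 5000
theta, samples = theta_hat, []
for _ in range(n_iter):
    batch = rng.choice(x, size=n_batch, replace=False)
    theta = theta + 0.5 * eps * grad_estimate(theta, batch) + np.sqrt(eps) * rng.normal()
    samples.append(theta)
samples = np.array(samples)
```

For this Gaussian model the per-datum gradient differences are deterministic, so the control variate removes the subsampling noise entirely; in general it only reduces the variance, by an amount that grows as the chain concentrates near the centre.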
Markov chain Monte Carlo (MCMC) algorithms are generally regarded as the gold standard technique for Bayesian inference. They are theoretically well-understood and conceptually simple to apply in practice. The drawback of MCMC is that performing exact inference generally requires all of the data to be processed at each iteration of the algorithm. For large datasets, the computational cost of MCMC can be prohibitive, which has led to recent developments in scalable Monte Carlo algorithms that have a significantly lower computational cost than standard MCMC. In this article, we focus on a particular class of scalable Monte Carlo algorithms, stochastic gradient Markov chain Monte Carlo (SGMCMC), which utilizes data subsampling techniques to reduce the per-iteration cost of MCMC. We provide an introduction to some popular SGMCMC algorithms and review the supporting theoretical results, as well as comparing the efficiency of SGMCMC algorithms against MCMC on benchmark examples. The supporting R code is available online at https://github.com/chris-nemeth/sgmcmc-review-paper.
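The basic subsampling mechanism behind SGMCMC can be illustrated with plain SGLD, the simplest algorithm in this class. The following is a minimal sketch under assumed toy choices (Gaussian model, fixed step size, batch size 10): each iteration estimates the log-posterior gradient from a minibatch, rescaled by N/n to keep it unbiased, and takes a Langevin step with injected Gaussian noise.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model (an assumption for illustration): x_i ~ N(theta, 1), prior theta ~ N(0, 10)
N = 1000
x = rng.normal(2.0, 1.0, size=N)

def grad_log_post_est(theta, batch):
    # Unbiased minibatch gradient: exact prior gradient plus the
    # likelihood-gradient sum over the batch, rescaled by N / batch size
    return -theta / 10.0 + (N / len(batch)) * np.sum(batch - theta)

# SGLD: Langevin step using the noisy gradient, no accept/reject correction
eps, n_batch, n_iter = 1e-3, 10, 10_000
theta, samples = 0.0, []
for _ in range(n_iter):
    batch = rng.choice(x, size=n_batch, replace=False)
    theta = theta + 0.5 * eps * grad_log_post_est(theta, batch) + np.sqrt(eps) * rng.normal()
    samples.append(theta)
samples = np.array(samples)
```

Each iteration touches only 10 of the 1000 data points; the price is extra gradient noise and discretisation bias, which is what the theoretical results reviewed in the article quantify.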
This paper develops a novel sequential Monte Carlo (SMC) approach for joint state and parameter estimation that can deal efficiently with abruptly changing parameters, a common occurrence when tracking maneuvering targets. The approach combines Bayesian methods for dealing with changepoints with methods for estimating static parameters within the SMC framework. The result is an approach which adaptively estimates the model parameters in accordance with changes to the target's trajectory. The developed approach is compared against the Interacting Multiple Model (IMM) filter for tracking a maneuvering target over a complex maneuvering scenario with nonlinear observations. In the IMM filter a large combination of models is required to account for unknown parameters. In contrast, the proposed approach circumvents the combinatorial complexity of applying multiple models in the IMM filter through Bayesian parameter estimation techniques. The developed approach is validated over complex maneuvering scenarios where both the system parameters and measurement noise parameters are unknown. Accurate estimation results are presented.
Keywords: Sequential Monte Carlo methods, joint state and parameter estimation, nonlinear systems, particle learning, tracking maneuvering targets.
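The SMC framework this abstract builds on can be illustrated by a minimal bootstrap particle filter, the basic building block that the proposed approach extends with changepoint and static-parameter estimation. Everything below is an assumed stand-in: a one-dimensional linear-Gaussian random-walk model rather than a tracking model, with known parameters.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed toy state space model (stand-in for a tracking model):
# x_t = x_{t-1} + v_t, v_t ~ N(0, q);  y_t = x_t + w_t, w_t ~ N(0, r)
T, q, r = 100, 0.1, 1.0
x_true = np.cumsum(rng.normal(0.0, np.sqrt(q), size=T))
y = x_true + rng.normal(0.0, np.sqrt(r), size=T)

P = 500                                  # number of particles
particles = rng.normal(0.0, 1.0, size=P)
filtered_means = []
for t in range(T):
    # Propagate particles through the state transition
    particles = particles + rng.normal(0.0, np.sqrt(q), size=P)
    # Weight by the observation likelihood (log-space for stability)
    logw = -0.5 * (y[t] - particles) ** 2 / r
    w = np.exp(logw - logw.max())
    w /= w.sum()
    filtered_means.append(np.sum(w * particles))
    # Multinomial resampling to combat weight degeneracy
    particles = particles[rng.choice(P, size=P, p=w)]
filtered_means = np.array(filtered_means)
```

The paper's contribution sits on top of this loop: augmenting each particle with model parameters and changepoint indicators so that the parameters adapt when the target maneuvers, rather than running a bank of fixed models as in the IMM filter.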
This paper proposes a new sampling scheme based on Langevin dynamics that is applicable within pseudo-marginal and particle Markov chain Monte Carlo algorithms. We investigate this algorithm's theoretical properties under standard asymptotics, which correspond to an increasing dimension of the parameters, n. Our results show that the behaviour of the algorithm depends crucially on how accurately one can estimate the gradient of the log target density. If the error in the estimate of the gradient is not sufficiently controlled as dimension increases, then asymptotically there will be no advantage over the simpler random-walk algorithm. However, if the error is sufficiently well-behaved, then the optimal scaling of this algorithm will be O(n^{-1/6}) compared to O(n^{-1/2}) for the random walk. Our theory also gives guidelines on how to tune the number of Monte Carlo samples in the likelihood estimate and the proposal step-size.
Markov chain Monte Carlo (MCMC) algorithms have become powerful tools for Bayesian inference. However, they do not scale well to large-data problems. Divide-and-conquer strategies, which split the data into batches and, for each batch, run independent MCMC algorithms targeting the corresponding subposterior, can spread the computational burden across a number of separate computer cores. The challenge with such strategies is in recombining the subposteriors to approximate the full posterior. By creating a Gaussian-process approximation for each log-subposterior density we create a tractable approximation for the full posterior. This approximation is exploited through three methodologies: firstly, a Hamiltonian Monte Carlo algorithm targeting the expectation of the posterior density provides a sample from an approximation to the posterior; secondly, evaluating the true posterior at the sampled points leads to an importance sampler that, asymptotically, targets the true posterior expectations; finally, an alternative importance sampler uses the full Gaussian-process distribution of the approximation to the log-posterior density to re-weight any initial sample and provide both an estimate of the posterior expectation and a measure of the uncertainty in it.
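The core recombination idea, fitting a Gaussian-process regression to each log-subposterior and summing the fitted surfaces, can be sketched in one dimension. The sketch is a simplification under assumed choices: a toy Gaussian model, two batches, a fixed squared-exponential kernel with a hand-picked lengthscale, and a grid search in place of the HMC and importance-sampling steps the paper actually uses.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy setting: x_i ~ N(theta, 1), prior theta ~ N(0, 10), two data batches.
# Each subposterior takes a fractional prior so the product recovers the full posterior.
N = 200
x = rng.normal(1.0, 1.0, size=N)
batches = [x[:100], x[100:]]

def log_subpost(theta, batch):
    # log of (prior^(1/2) * batch likelihood), up to an additive constant
    return -theta**2 / (2.0 * 10.0 * len(batches)) - 0.5 * np.sum((batch - theta) ** 2)

design = np.linspace(-1.0, 3.0, 21)   # points where each log-subposterior is evaluated
pred = np.linspace(-1.0, 3.0, 401)    # finer grid for the combined approximation

def gp_mean(X, y, Xs, ell=0.5, jitter=1e-6):
    # Predictive mean of a squared-exponential GP regression fitted to (X, y),
    # with outputs standardised for numerical stability
    ys = (y - y.mean()) / y.std()
    k = lambda a, b: np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell**2)
    K = k(X, X) + jitter * np.eye(len(X))
    return y.mean() + y.std() * (k(Xs, X) @ np.linalg.solve(K, ys))

# Summing the GP means of the log-subposteriors approximates the full
# log-posterior up to an additive constant
approx_log_post = sum(
    gp_mean(design, np.array([log_subpost(t, b) for t in design]), pred)
    for b in batches
)
theta_map = pred[np.argmax(approx_log_post)]
```

Because the approximation lives on the log scale, summing the per-batch surfaces corresponds to multiplying the subposteriors, which is exactly how the full posterior factorises across batches.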
Poyiadjis et al. (2011) show how particle methods can be used to estimate both the score and the observed information matrix for state space models. These methods either suffer from a computational cost that is quadratic in the number of particles, or produce estimates whose variance increases quadratically with the amount of data. This paper introduces an alternative approach for estimating these terms at a computational cost that is linear in the number of particles. The method is derived using a combination of kernel density estimation, to avoid the particle degeneracy that causes the quadratically increasing variance, and Rao-Blackwellisation. Crucially, we show the method is robust to the choice of bandwidth within the kernel density estimation, as it has good asymptotic properties regardless of this choice. Our estimates of the score and observed information matrix can be used within both online and batch procedures for estimating parameters for state space models. Empirical results show improved parameter estimates compared to existing methods at a significantly reduced computational cost. Supplementary materials including code are available.
Abstract. Firn densification modelling is key to understanding ice sheet mass balance, ice sheet surface elevation change, and the age difference between ice and the air in enclosed air bubbles. This has resulted in the development of many firn models, all relying to a certain degree on parameter calibration against observed data. We present a novel Bayesian calibration method for these parameters and apply it to three existing firn models. Using an extensive dataset of firn cores from Greenland and Antarctica, we reach optimal parameter estimates applicable to both ice sheets. We then use these to simulate firn density and evaluate against independent observations. Our simulations show a significant decrease (24 % and 56 %) in observation–model discrepancy for two models and a smaller increase (15 %) for the third. As opposed to current methods, the Bayesian framework allows for robust uncertainty analysis related to parameter values. Based on our results, we review some inherent model assumptions and demonstrate how firn model choice and uncertainties in parameter values cause spread in key model outputs.
Objectives: Predictions estimate supplies of filtering facepiece respirators (FFRs) would be limited in the event of a severe influenza pandemic. Ultraviolet decontamination and reuse (UVDR) is a potential approach to mitigate an FFR shortage. A field study sought to understand healthcare workers' perspectives and potential logistics issues related to implementation of UVDR methods for FFRs in hospitals. Methods: Data were collected at three hospitals using a structured guide to conduct 19 individual interviews, 103 focus group interviews, and 285 individual surveys. Data were then evaluated using thematic analysis to reveal key themes. Results: Data revealed noteworthy variation in FFR use across the sample, along with preferences and requirements for the use of UVDR, unit design, and FFR reuse. Based on a scale of 1 (low) to 10 (high), the mean perception of safety in a high mortality pandemic wearing no FFR was 1.25 of 10, wearing an FFR for an extended period without decontamination was 4.20 of 10, and using UVDR was 7.72 of 10. Conclusions: In addition to technical design and development, preparation and training will be essential to successful implementation of a UVDR program. Ultraviolet decontamination and reuse program design and implementation must account for actual clinical practice, compliance with regulations, and practical financial considerations to be successfully adopted so that it can mitigate potential FFR shortages in a pandemic.