This paper presents a new Metropolis-adjusted Langevin algorithm (MALA) that uses convex analysis to simulate efficiently from high-dimensional densities that are log-concave, a class of probability distributions that is widely used in modern high-dimensional statistics and data analysis. The method is based on a new first-order approximation for Langevin diffusions that exploits log-concavity to construct Markov chains with favourable convergence properties. This approximation is closely related to Moreau-Yoshida regularisations for convex functions and uses proximity mappings instead of gradient mappings to approximate the continuous-time process. The proposed method complements existing MALA methods in two ways. First, the method is shown to have very robust stability properties and to converge geometrically for many target densities for which other MALA algorithms are not geometrically ergodic, or are so only if the step size is sufficiently small. Second, the method can be applied to high-dimensional target densities that are not continuously differentiable, a class of distributions that is increasingly used in image processing and machine learning and that is beyond the scope of existing MALA and HMC algorithms. To use this method it is necessary to compute or to approximate efficiently the proximity mappings of the logarithm of the target density. For several popular models, including many Bayesian models used in modern signal and image processing and machine learning, this can be achieved with convex optimisation algorithms and with approximations based on proximal splitting techniques, which can be implemented in parallel. The proposed method is demonstrated on two challenging high-dimensional and non-differentiable models related to image resolution enhancement and low-rank matrix estimation that are not well addressed by existing MCMC methodology.
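To make the idea above concrete, here is a minimal one-dimensional sketch (not the paper's implementation) of a Metropolis-adjusted proximal Langevin sampler for a Laplace target. The proposal mean is a proximal step on g(x) = -log π(x) = |x|, whose proximity mapping is the soft-thresholding operator; the function names and tuning values (`delta`, the burn-in length) are illustrative assumptions.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximity mapping of t*|.| : prox_{t|.|}(x) = sign(x) * max(|x| - t, 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def log_target(x):
    """Log-density of a standard Laplace target, pi(x) proportional to exp(-|x|)."""
    return -np.abs(x)

def prox_mala(n_iter=20000, delta=0.5, x0=0.0, seed=0):
    """Sketch of a Metropolis-adjusted proximal Langevin sampler.

    The proposal mean replaces the gradient step of classical MALA by a
    proximal step on g = -log pi, i.e. a gradient step on the
    Moreau-Yoshida envelope of g, which is well defined even though
    g(x) = |x| is not differentiable at 0.
    """
    rng = np.random.default_rng(seed)
    x = x0
    samples = np.empty(n_iter)
    for k in range(n_iter):
        mean_x = soft_threshold(x, delta / 2.0)           # forward proposal mean
        y = mean_x + np.sqrt(delta) * rng.standard_normal()
        mean_y = soft_threshold(y, delta / 2.0)           # reverse proposal mean
        log_q_fwd = -0.5 * (y - mean_x) ** 2 / delta
        log_q_rev = -0.5 * (x - mean_y) ** 2 / delta
        log_alpha = log_target(y) - log_target(x) + log_q_rev - log_q_fwd
        if np.log(rng.uniform()) < log_alpha:             # Metropolis-Hastings correction
            x = y
        samples[k] = x
    return samples
```

Because of the Metropolis-Hastings correction, the chain targets the Laplace density exactly; after burn-in its sample mean and variance should be close to 0 and 2 respectively.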
Modern imaging methods rely strongly on Bayesian inference techniques to solve challenging imaging problems. Currently, the predominant Bayesian computation approach is convex optimisation, which scales very efficiently to high dimensional image models and delivers accurate point estimation results. However, in order to perform more complex analyses, for example image uncertainty quantification or model selection, it is necessary to use more computationally intensive Bayesian computation techniques such as Markov chain Monte Carlo methods. This paper presents a new and highly efficient Markov chain Monte Carlo methodology to perform Bayesian computation for high dimensional models that are log-concave and non-smooth, a class of models that is central in imaging sciences. The methodology is based on a regularised unadjusted Langevin algorithm that exploits tools from convex analysis, namely Moreau-Yoshida envelopes and proximal operators, to construct Markov chains with favourable convergence properties. In addition to scaling efficiently to high dimensions, the method is straightforward to apply to models that are currently solved by using proximal optimisation algorithms. We provide a detailed theoretical analysis of the proposed methodology, including asymptotic and non-asymptotic convergence results with easily verifiable conditions, and explicit bounds on the convergence rates. The proposed methodology is demonstrated with four experiments related to image deconvolution and tomographic reconstruction with total-variation and ℓ1 priors, where we conduct a range of challenging Bayesian analyses related to uncertainty quantification, hypothesis testing, and model selection in the absence of ground truth.
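A minimal sketch of the regularised unadjusted Langevin idea, on a scalar toy model rather than an imaging one: the target is π(x) proportional to exp(-f(x) - g(x)) with a smooth Gaussian likelihood f and a non-smooth ℓ1 prior g, and the non-smooth term enters the update only through its proximal operator. All parameter values (`lam`, `gamma`, `y`) are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def myula(y=2.0, sigma=1.0, alpha=1.0, lam=0.1, gamma=0.05,
          n_iter=50000, seed=0):
    """Sketch of a Moreau-Yoshida regularised unadjusted Langevin algorithm
    for pi(x) proportional to exp(-f(x) - g(x)), with smooth likelihood
    f(x) = (x - y)^2 / (2 sigma^2) and non-smooth prior g(x) = alpha * |x|.

    g is replaced by its Moreau-Yoshida envelope, whose gradient is
    (x - prox_{lam*g}(x)) / lam; no gradient of g itself is ever needed.
    """
    rng = np.random.default_rng(seed)
    x = 0.0
    out = np.empty(n_iter)
    for k in range(n_iter):
        grad_f = (x - y) / sigma ** 2                       # gradient of smooth part
        prox_g = np.sign(x) * max(abs(x) - lam * alpha, 0)  # prox_{lam*g}(x), soft-thresholding
        drift = -grad_f - (x - prox_g) / lam                # Langevin drift on the smoothed target
        x = x + gamma * drift + np.sqrt(2 * gamma) * rng.standard_normal()
        out[k] = x
    return out
```

There is no Metropolis-Hastings correction here: the chain targets a smoothed approximation of π, with a bias controlled by the regularisation parameter `lam` and the step size `gamma`, which is the trade-off the theoretical analysis quantifies.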
Recent decades have seen enormous improvements in computational inference for statistical models, with continual enhancements across a wide range of computational tools. In Bayesian inference, first and foremost, MCMC techniques have continued to evolve, moving from random walk proposals to Langevin drift, to Hamiltonian Monte Carlo, and so on, with both theoretical and algorithmic innovations opening new opportunities to practitioners. However, this impressive evolution in capacity is confronted by an even steeper increase in the complexity of the datasets to be addressed. The difficulties of modelling and then handling ever more complex datasets most likely call for a new type of tool for computational inference that dramatically reduces the dimension and size of the raw data while capturing its essential aspects. Approximate models and algorithms may thus be at the core of the next computational revolution.
Modern signal processing (SP) methods rely very heavily on probability and statistics to solve challenging SP problems. SP methods are now expected to deal with ever more complex models, requiring ever more sophisticated computational inference techniques. This has driven the development of statistical SP methods based on stochastic simulation and optimization. Stochastic simulation and optimization algorithms are computationally intensive tools for performing statistical inference in models that are analytically intractable and beyond the scope of deterministic inference methods. They have recently been successfully applied to many difficult problems involving complex statistical models and sophisticated (often Bayesian) statistical inference techniques. This survey paper offers an introduction to stochastic simulation and optimization methods in signal and image processing. The paper addresses a variety of high-dimensional Markov chain Monte Carlo (MCMC) methods as well as deterministic surrogate methods, such as variational Bayes, the Bethe approach, belief and expectation propagation, and approximate message passing algorithms. It also discusses a range of optimization methods that have been adopted to solve stochastic problems, as well as stochastic methods for deterministic optimization. Finally, areas of overlap between simulation and optimization, in particular optimization-within-MCMC and MCMC-driven optimization, are discussed.
This paper addresses the problem of estimating the Potts parameter β jointly with the unknown parameters of a Bayesian model within a Markov chain Monte Carlo (MCMC) algorithm. Standard MCMC methods cannot be applied to this problem because performing inference on β requires computing the intractable normalizing constant of the Potts model. In the proposed MCMC method the estimation of β is conducted using a likelihood-free Metropolis-Hastings algorithm. Experimental results obtained for synthetic data show that estimating β jointly with the other unknown parameters leads to estimation results that are as good as those obtained with the actual value of β. On the other hand, assuming that the value of β is known can degrade estimation performance significantly if this value is incorrect. To illustrate the interest of this method, the proposed algorithm is successfully applied to real bidimensional SAR and tridimensional ultrasound images.

Index Terms—Potts-Markov field, mixture model, Bayesian estimation, Gibbs sampler, intractable normalizing constants. arXiv:1207.5355v1 [stat.CO] 23 Jul 2012

… resulting in the so-called pseudo-likelihood estimators [20]. Although analytically convenient, this approach generally does not lead to a satisfactory posterior density and results in poor estimation [21]. Also, as noticed in [18], such a prior distribution generally depends on the data, since the normalizing constant C(β) depends implicitly on the number of observations (priors that depend on the data are not recommended in the Bayesian paradigm [22, p. 36]).

B. Approximation of C(β)

Another possibility is to approximate the normalizing constant C(β). Existing approximations can be classified into three categories: those based on analytical developments, on sampling strategies, or on a combination of both. A survey of the state-of-the-art approximation methods up to 2004 has been presented in [18].
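As an illustration of the likelihood-free idea (a simplified sketch, not the paper's algorithm), the intractable ratio C(β)/C(β*) in the Metropolis-Hastings ratio can be bypassed by simulating an auxiliary field at the proposed β* and accepting only when its summary statistic is close to that of the observed field. The grid size, prior support, tolerance `eps`, and all function names are illustrative assumptions.

```python
import numpy as np

def gibbs_sweeps(beta, field, n_sweeps, rng):
    """Checkerboard Gibbs sampler for a 2-label Potts field on a torus,
    pi(x | beta) proportional to exp(beta * #{equal nearest-neighbour pairs})."""
    n = field.shape[0]
    colour = (np.indices((n, n)).sum(axis=0) % 2).astype(bool)
    for _ in range(n_sweeps):
        for mask in (colour, ~colour):
            nb1 = sum(np.roll(field, s, axis=a) for s in (-1, 1) for a in (0, 1))
            p1 = np.exp(beta * nb1)            # conditional weight of label 1
            p0 = np.exp(beta * (4 - nb1))      # conditional weight of label 0
            draw = rng.uniform(size=(n, n)) < p1 / (p0 + p1)
            field[mask] = draw[mask]
    return field

def equal_frac(f):
    """Summary statistic: fraction of equal nearest-neighbour pairs."""
    return 0.5 * (np.mean(f == np.roll(f, 1, 0)) + np.mean(f == np.roll(f, 1, 1)))

def abc_mh_beta(x_obs, n_iter=300, step=0.1, eps=0.05, n_sweeps=20, seed=1):
    """Likelihood-free Metropolis-Hastings for beta: instead of evaluating
    C(beta), simulate an auxiliary field at the proposed value and accept
    when its summary statistic matches the observed one within eps."""
    rng = np.random.default_rng(seed)
    s_obs = equal_frac(x_obs)
    beta, chain = 0.5, np.empty(n_iter)
    for k in range(n_iter):
        prop = beta + step * rng.standard_normal()
        if 0.0 < prop < 1.0:                   # flat prior on (0, 1)
            z = rng.integers(0, 2, x_obs.shape)
            gibbs_sweeps(prop, z, n_sweeps, rng)
            if abs(equal_frac(z) - s_obs) < eps:
                beta = prop
        chain[k] = beta
    return chain
```

In practice one would generate a synthetic observation by running `gibbs_sweeps` at a known β and check that the chain concentrates near it; the tolerance `eps` governs the usual ABC trade-off between bias and acceptance rate.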
The methods considered in [18] are the mean field, the tree-structured mean field and the Bethe energy (loopy Metropolis) approximations, as well as two sampling strategies based on Langevin MCMC algorithms. More recently, exact recursive expressions have been proposed to compute C(β) analytically [9]. However, to our knowledge, these recursive methods have only been successfully applied to small problems (i.e., to MRFs of size smaller than 40 × 40) with reduced spatial correlation β < 0.5. Another sampling-based approximation consists of estimating C(β) by Monte Carlo integration [23, Chap. 3], at the expense of very substantial computation and possibly biased estimates (the bias arises from the estimation error of C(β)). Better results can be obtained by using importance or path sampling methods [24]. These methods have been applied to the estimation of β within an MCMC image processing algorithm in [17]. Although more precise than Monte Carlo integration, approximating C(β) by importance or path sampling still requires substantial computation and is generally infeasible for large fields. This has motivated recent works that reduce computati...
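The Monte Carlo integration route mentioned above can be sketched on a lattice small enough that C(β) is also computable by exhaustive enumeration, which makes the estimator's behaviour easy to check. This toy example uses a 3×3 two-label Potts field on a torus and a plain uniform-sampling estimator; it is an illustration of the principle, not any of the cited methods.

```python
import itertools
import numpy as np

def equal_pairs(x):
    """S(x): number of equal nearest-neighbour pairs on a toroidal lattice."""
    return int(np.sum(x == np.roll(x, 1, 0)) + np.sum(x == np.roll(x, 1, 1)))

def exact_C(beta):
    """Exact normalizing constant C(beta) = sum_x exp(beta * S(x)),
    by enumerating all 2^9 = 512 configurations of a 3x3 binary field."""
    total = 0.0
    for bits in itertools.product((0, 1), repeat=9):
        x = np.array(bits).reshape(3, 3)
        total += np.exp(beta * equal_pairs(x))
    return total

def mc_C(beta, n=200000, seed=0):
    """Monte Carlo estimate: C(beta) = 512 * E_uniform[exp(beta * S(X))],
    with X drawn uniformly over the 512 configurations."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=(n, 3, 3))
    s = (np.sum(x == np.roll(x, 1, 1), axis=(1, 2))
         + np.sum(x == np.roll(x, 1, 2), axis=(1, 2)))
    return 512 * np.mean(np.exp(beta * s))
```

Even here the estimator's variance grows quickly with β, because the weight exp(β S(x)) concentrates on the rare highly ordered configurations; this is the cost, and the source of the bias after taking logarithms, that importance and path sampling [24] are designed to mitigate.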