We prove that maximum a posteriori estimators are well-defined for diagonal Gaussian priors µ on ℓ^p under common assumptions on the potential Φ. Further, we show connections to the Onsager–Machlup functional and provide a corrected and strongly simplified proof in the Hilbert space case p = 2, previously established by Dashti et al. (2013) and Kretschmann (2019). These corrections do not generalize to the setting 1 ≤ p < ∞, which requires a novel convexification result for the difference between the Cameron–Martin norm and the p-norm.
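For context, under the standard assumptions on the potential Φ (as in Dashti et al., 2013), the Onsager–Machlup functional of a posterior with Gaussian prior µ takes the familiar Tikhonov-type form, where E denotes the Cameron–Martin space of µ:

```latex
I(u) \;=\; \Phi(u) \;+\; \tfrac{1}{2}\,\lVert u \rVert_{E}^{2}
```

MAP estimators are then characterised as minimisers of I, which is what links the Bayesian and variational viewpoints discussed below.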

The Bayesian solution to a statistical inverse problem can be summarised by a mode of the posterior distribution, i.e. a maximum a posteriori (MAP) estimator. The MAP estimator essentially coincides with the (regularised) variational solution to the inverse problem, seen as minimisation of the Onsager–Machlup (OM) functional of the posterior measure. An open problem in the stability analysis of inverse problems is to establish a relationship between the convergence properties of solutions obtained by the variational approach and by the Bayesian approach. To address this problem, we propose a general convergence theory for modes that is based on the Γ-convergence of OM functionals, and apply this theory to Bayesian inverse problems with Gaussian and edge-preserving Besov priors. Part II of this paper considers more general prior distributions.
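For readers unfamiliar with the notion, Γ-convergence of functionals I_n to I on a metric space X is the standard two-sided condition:

```latex
\begin{aligned}
&\text{(liminf inequality)} && \forall\, u_n \to u:\quad I(u) \;\le\; \liminf_{n\to\infty} I_n(u_n),\\
&\text{(recovery sequence)} && \forall\, u \in X \;\exists\, u_n \to u:\quad \limsup_{n\to\infty} I_n(u_n) \;\le\; I(u).
\end{aligned}
```

Together with equicoercivity, Γ-convergence implies convergence of minimisers, which is the mechanism underlying a convergence theory for modes via OM functionals.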

We derive Onsager–Machlup functionals for countable product measures on weighted ℓ^p subspaces of the sequence space ℝ^ℕ. Each measure in the product is a shifted and scaled copy of a reference probability measure on ℝ that admits a sufficiently regular Lebesgue density. We study the equicoercivity and Γ-convergence of sequences of Onsager–Machlup functionals associated to convergent sequences of measures within this class. We use these results to establish analogous results for probability measures on separable Banach or Hilbert spaces, including Gaussian, Cauchy, and Besov measures with summability parameter 1 ⩽ p ⩽ 2. Together with part I of this paper, this provides a basis for analysis of the convergence of maximum a posteriori estimators in Bayesian inverse problems and most likely paths in transition path theory.

Markov chain algorithms are ubiquitous in machine learning, statistics, and many other disciplines. In this work we present a novel estimator applicable to several classes of Markov chains, dubbed Markov chain importance sampling (MCIS). For a broad class of Metropolis–Hastings algorithms, MCIS efficiently makes use of rejected proposals. For discretized Langevin diffusions, it provides a novel way of correcting the discretization error. Our estimator satisfies a central limit theorem and improves on error per CPU cycle, often to a large extent. As a by-product, it enables estimating the normalizing constant, an important quantity in Bayesian machine learning and statistics. An efficient estimation of ρ_θ for two classes of Markov chain algorithms, the Euler–Maruyama discretized Langevin diffusion (DL) and Metropolis–Hastings (MH), results in our integral estimators S_K^MHIS and S_K^DLIS, which strongly outperform the standard estimator. In addition, our estimators allow us to approximate the normalizing constant Z of the target density, a quantity that is important for Bayesian model selection and averaging. This refutes the folk theorem that Metropolis–Hastings algorithms do not allow for an easy approximation of the normalizing constant Z. The paper is structured as follows. In Section 2, we introduce Markov chain importance sampling (MCIS) and derive explicit formulas for, and approximations of, ρ_θ for MH and DL. Section 3 deals with the convergence properties of S_K^MHIS, including a law of large numbers as well as a central limit theorem.
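The idea of making use of rejected proposals can be illustrated by a related, simpler waste-recycling estimator (in the spirit of Casella and Robert). This is not the paper's S_K^MHIS estimator itself, only a sketch of the same flavour: every Metropolis–Hastings proposal contributes, weighted by its acceptance probability, via Rao-Blackwellisation over the accept/reject decision. All names below are hypothetical.

```python
import numpy as np

def mh_with_proposals(log_target, x0, n_steps, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings that records, at each step,
    the current state x_t, the proposal y_t, and the acceptance
    probability alpha_t, so rejected proposals can be recycled."""
    rng = np.random.default_rng(seed)
    x = x0
    states, props, alphas = [], [], []
    for _ in range(n_steps):
        y = x + step * rng.standard_normal()
        # log-scale acceptance probability, capped at 0 to avoid overflow
        alpha = np.exp(min(0.0, log_target(y) - log_target(x)))
        states.append(x); props.append(y); alphas.append(alpha)
        if rng.random() < alpha:
            x = y
    return np.array(states), np.array(props), np.array(alphas)

def recycle_estimate(f, states, props, alphas):
    # E[f] ~ (1/K) * sum_t [ alpha_t f(y_t) + (1 - alpha_t) f(x_t) ],
    # the Rao-Blackwellised average over the accept/reject step.
    return np.mean(alphas * f(props) + (1.0 - alphas) * f(states))

# Example: estimate E[X^2] = 1 under a standard normal target.
log_target = lambda x: -0.5 * x**2
s, p, a = mh_with_proposals(log_target, x0=0.0, n_steps=20000)
est = recycle_estimate(lambda x: x**2, s, p, a)
```

Unlike the standard ergodic average, this estimator extracts information from every proposal, which is the intuition behind using rejected states; the MCIS estimator of the paper additionally reweights by an (approximate) stationary density ρ_θ.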

The linear conditional expectation (LCE) provides a best linear (or rather, affine) estimate of the conditional expectation and hence plays an important rôle in approximate Bayesian inference, especially the Bayes linear approach. This article establishes the analytical properties of the LCE in an infinite-dimensional Hilbert space context. In addition, working in the space of affine Hilbert-Schmidt operators, we establish a regularisation procedure for this LCE. As an important application, we obtain a simple alternative derivation and intuitive justification of the conditional mean embedding formula, a concept widely used in machine learning to perform the conditioning of random variables by embedding them into reproducing kernel Hilbert spaces.
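For reference, the conditional mean embedding formula alluded to here is usually stated, in its regularised empirical form from the kernel-embedding literature (the article's contribution being an alternative derivation via the LCE), as

```latex
\mu_{Y \mid X = x} \;=\; \mathcal{C}_{YX}\,\bigl(\mathcal{C}_{XX} + \lambda\,\mathrm{id}\bigr)^{-1} k(x, \cdot),
```

where C_{YX} and C_{XX} are (cross-)covariance operators on the reproducing kernel Hilbert spaces, k is the kernel on the conditioning variable, and λ > 0 is a regularisation parameter.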

In Bayesian inference, the choice of the prior often remains a debatable question. Empirical Bayes methods offer a data-driven solution to this problem by estimating the prior itself from an ensemble of data. In the nonparametric case, the maximum likelihood estimate is known to overfit the data, an issue that is commonly tackled by regularization. However, the majority of regularizations are ad hoc choices that lack invariance under reparametrization of the model and result in inconsistent estimates for equivalent models. We introduce a nonparametric, transformation-invariant estimator for the prior distribution. Being defined in terms of the missing information, similar to the reference prior, it can be seen as an extension of the latter to the data-driven setting. This implies a natural interpretation as a trade-off between choosing the least informative prior and incorporating the information provided by the data, a symbiosis between the objective and empirical Bayes methodologies.
