2015
DOI: 10.1137/140962802
Kullback--Leibler Approximation for Probability Measures on Infinite Dimensional Spaces

Abstract: In a variety of applications it is important to extract information from a probability measure μ on an infinite dimensional space. Examples include the Bayesian approach to inverse problems and (possibly conditioned) continuous time Markov processes. It may then be of interest to find a measure ν, from within a simple class of measures, which approximates μ. This problem is studied in the case where the Kullback-Leibler divergence is employed to measure the quality of the approximation. A calculus of…

Cited by 51 publications (67 citation statements). References 21 publications.
“…Swapping the order of these two measures within the divergence is undesirable for our purposes. This is because minimizing D_KL(μ_ε ‖ ·) within the set of all Gaussian measures will lead to matching of moments [3]; this is inappropriate for multimodal measures, where a more desirable outcome would be the existence of multiple local minimizers at each mode [23,22]. Although the Kullback-Leibler divergence is not a metric, its information-theoretic interpretation makes it natural for approximate inference.…”
Section: Set-up (mentioning)
confidence: 99%
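A brief finite-dimensional sketch of the moment-matching fact referenced above (my own illustration of the standard computation, not material from the cited papers; μ is assumed to have mean m_μ and covariance Σ_μ, and ν = N(m, C) is a nondegenerate Gaussian on R^d):

```latex
% Forward KL from mu to a Gaussian nu = N(m, C); only the cross term depends on (m, C):
% D_KL(mu || nu) = -H(mu) - E_mu[ log nu(x) ].
\[
-\mathbb{E}_{\mu}\!\left[\log \nu(x)\right]
  = \tfrac{1}{2}\Big( d\log 2\pi + \log\det C
    + \operatorname{tr}\!\big(C^{-1}\Sigma_{\mu}\big)
    + (m_{\mu}-m)^{\top} C^{-1}\,(m_{\mu}-m) \Big),
\]
% and setting the derivatives with respect to m and C to zero gives the unique minimizer
\[
  m^{*} = m_{\mu}, \qquad C^{*} = \Sigma_{\mu},
\]
% i.e. minimizing D_KL(mu || .) over Gaussians matches the mean and covariance of mu,
% so a multimodal mu is summarized by a single broad Gaussian in this direction.
```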
“…Contrary to the evidence for minimizing the approximation KL, we find many examples where an approximation is made by minimizing other functionals; for example, minimizing the inference KL (e.g., [7][8][9][10][11][12][13]). For many, though not all, of these works, this is because minimizing the approximation KL is not feasible in practice, since the true distribution p is not accessible.…”
Section: Introduction (mentioning)
confidence: 66%
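For orientation, the two divergence directions contrasted in this excerpt can be written out explicitly; the labels below follow my reading of the quoted paper (an assumption, since the excerpt does not define them formally):

```latex
% "Approximation KL": divergence from the true density p to the approximation q.
\[
  D_{\mathrm{KL}}(p \,\|\, q) \;=\; \int p(x)\,\log\frac{p(x)}{q(x)}\,dx
  \qquad \text{(mass-covering; requires access to } p\text{)}
\]
% "Inference KL": the reverse direction, standard in variational inference.
\[
  D_{\mathrm{KL}}(q \,\|\, p) \;=\; \int q(x)\,\log\frac{q(x)}{p(x)}\,dx
  \qquad \text{(mode-seeking; needs } p \text{ only up to a normalizing constant)}
\]
```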
“…However, the method only gives rise to maximum a posteriori or maximum likelihood solutions, which corresponds to optimizing the δ-loss of Equation (2). In Reference [11], it is claimed that minimizing the inference KL yields more desirable results: for multi-modal distributions, individual modes can be fitted with a mono-modal distribution such as a Gaussian, whereas minimizing the approximation KL yields a distribution with very large variance in order to account for all modes. Figure 1 shows an example of this behavior.…”
Section: Discussion (mentioning)
confidence: 99%
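As a concrete numerical illustration of this contrast (my own sketch, not code from the cited works), the snippet below fits a single Gaussian to a two-component 1-D Gaussian mixture in both directions: the approximation KL D_KL(p‖q) is minimized in closed form by moment matching, while the inference KL D_KL(q‖p) is minimized numerically and locks onto one mode. All names and the target mixture are hypothetical choices for the demo.

```python
# Sketch: forward ("approximation") vs reverse ("inference") KL fits of a Gaussian
# to a bimodal 1-D target. Illustrative only; assumes numpy and scipy are available.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

# Target p: equal-weight mixture of N(-3, 0.5^2) and N(+3, 0.5^2).
def p_pdf(x):
    return 0.5 * norm.pdf(x, -3.0, 0.5) + 0.5 * norm.pdf(x, 3.0, 0.5)

xs = np.linspace(-10, 10, 4001)   # quadrature grid
dx = xs[1] - xs[0]
px = p_pdf(xs)

# Forward KL D(p || q): minimized over Gaussians by matching the mean and variance of p.
mean_p = np.sum(xs * px) * dx
var_p = np.sum((xs - mean_p) ** 2 * px) * dx
print("approximation-KL fit:", mean_p, np.sqrt(var_p))   # roughly N(0, large variance)

# Reverse KL D(q || p): minimized numerically; tends to collapse onto a single mode.
def reverse_kl(params):
    m, log_s = params                      # log-scale keeps the std. dev. positive
    qx = norm.pdf(xs, m, np.exp(log_s))
    return np.sum(qx * (np.log(qx + 1e-300) - np.log(px + 1e-300))) * dx

res = minimize(reverse_kl, x0=[2.0, 0.0], method="Nelder-Mead")
print("inference-KL fit:    ", res.x[0], np.exp(res.x[1]))  # roughly N(3, 0.5): one mode
```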
“…In a closely related direction, information metrics provide systematic, practical, and widely used tools to build approximate statistical models of reduced complexity through variational inference methods [64,6,74] for machine learning [98,38,6], and coarse-graining of complex systems at equilibrium [84,10,79,4,5,28]. However, dynamics are of critical importance in reaction networks and such earlier works on equilibrium coarse-graining are not applicable.…”
Section: Introduction (mentioning)
confidence: 99%