2009 International Joint Conference on Neural Networks
DOI: 10.1109/ijcnn.2009.5178726

A gradient-based algorithm competitive with variational Bayesian EM for mixture of Gaussians

Abstract: While variational Bayesian (VB) inference is typically done with the so-called VB EM algorithm, there are models where it cannot be applied because either the E-step or the M-step cannot be solved analytically. In 2007, Honkela et al. introduced a recipe for a gradient-based algorithm for VB inference that does not have such a restriction. In this paper, we derive the algorithm in the case of the mixture of Gaussians model. For the first time, the algorithm is experimentally compared to VB EM and its …
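The abstract names the approach only in outline; as a rough illustration of what a gradient-based alternative to VB EM looks like, here is a minimal Python sketch of a single natural-gradient ascent step on a variational lower bound. The function names (natural_gradient_step, grad_bound, fisher) and the toy one-dimensional Gaussian example are illustrative assumptions, not code from the paper, which works with the full mixture-of-Gaussians model.

import numpy as np

# Minimal sketch of one natural-gradient ascent step on a variational lower
# bound L(theta). grad_bound and fisher are placeholders supplied by the
# model: the Euclidean gradient of the bound and the Fisher information of
# the variational distribution q(. | theta), respectively.
def natural_gradient_step(theta, grad_bound, fisher, step_size=0.1):
    """theta <- theta + step_size * G(theta)^{-1} dL/dtheta."""
    g = grad_bound(theta)             # Euclidean gradient of the bound
    G = fisher(theta)                 # Fisher information of q
    nat_g = np.linalg.solve(G, g)     # natural gradient G^{-1} g
    return theta + step_size * nat_g

# Toy check: q = N(mu, 1) with fixed unit variance fitted to a N(3, 1) target.
# Up to a constant the bound is -(mu - 3)^2 / 2 and the Fisher information of
# q with respect to mu is 1, so natural and Euclidean gradients coincide here.
grad = lambda mu: np.array([3.0 - mu[0]])
fish = lambda mu: np.eye(1)
mu = np.array([0.0])
for _ in range(50):
    mu = natural_gradient_step(mu, grad, fish)
print(mu)  # approaches [3.0]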

Cited by 13 publications (6 citation statements) · References 12 publications

“…This matrix is singular due to the over-parameterised nature of the softmax function, which makes computing the natural gradient problematic. Kuusela et al. [2009] suggest omitting the first element of $\gamma_n$, writing $\gamma_n = [\gamma_{n2}, \dots$ …”
Section: The Natural Gradient in Softmax (mentioning)
confidence: 99%
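The excerpt above is terse about how the singular Fisher matrix is handled; the Python sketch below shows one way to realise the reduced parameterisation it describes, pinning the first softmax logit to zero so that the remaining K-1 parameters have a non-singular Fisher matrix. The function names and the three-class example are illustrative assumptions, not code from either paper.

import numpy as np

# Reduced softmax parameterisation: the first logit is fixed to zero, which
# removes the redundant degree of freedom responsible for the singular Fisher
# matrix of the full parameterisation.
def softmax_reduced(gamma_rest):
    """Map K-1 free parameters to a K-dimensional probability vector."""
    logits = np.concatenate(([0.0], gamma_rest))  # first logit pinned to 0
    logits -= logits.max()                        # numerical stability
    e = np.exp(logits)
    return e / e.sum()

def fisher_reduced(gamma_rest):
    """Fisher information of the categorical distribution in the K-1 free
    coordinates: the (2..K, 2..K) block of diag(p) - p p^T."""
    p = softmax_reduced(gamma_rest)
    F_full = np.diag(p) - np.outer(p, p)          # singular in full coordinates
    return F_full[1:, 1:]                         # non-singular restriction

gamma = np.array([0.5, -1.0])
print(softmax_reduced(gamma))                     # probabilities over 3 classes
print(np.linalg.cond(fisher_reduced(gamma)))      # finite condition number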
“…The majority of previous gradient-based statistical inference approaches employ the flat-space approximation of the conjugate gradient [28][29][30], based on the rationale that the minimization of functions on a Riemannian manifold is locally equivalent to minimization on a Euclidean space (since every Riemannian manifold can be isometrically embedded in a Euclidean space). However, as the statistical space is a curved manifold, most of the Euclidean space operations become undefined.…”
Section: The Optimization Algorithm (mentioning)
confidence: 99%
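To make the "flat space approximation" concrete, the block below writes out the generic construction the excerpt refers to: natural gradients plugged into an ordinary Polak-Ribiere conjugate-gradient recursion, with parallel transport of the previous direction and the change of metric between iterates simply ignored. This is a hedged sketch of the general idea, not the exact update used in the cited works.

% Flat-space conjugate gradient with natural gradients (sketch): the natural
% gradient \tilde{g}_t = G(\theta_t)^{-1} g_t replaces the Euclidean gradient
% in the standard Polak-Ribiere recursion, treating the parameter space as
% flat (no parallel transport of d_{t-1}, no update of the metric inside the
% inner products).
\begin{align*}
  \tilde{g}_t &= G(\theta_t)^{-1} g_t, \\
  \beta_t &= \frac{\langle \tilde{g}_t,\; \tilde{g}_t - \tilde{g}_{t-1} \rangle}
                  {\langle \tilde{g}_{t-1},\; \tilde{g}_{t-1} \rangle}, \\
  d_t &= -\tilde{g}_t + \beta_t\, d_{t-1}, \qquad
  \theta_{t+1} = \theta_t + \eta_t\, d_t .
\end{align*}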
“…It turns out that the KLC bound is particularly amenable to Riemannian or natural gradient methods, because the information geometry of the exponential family distribution(s), over which we are optimising, leads to a simple expression for the natural gradient. Previous investigations of natural gradients for variational Bayes [Honkela et al., 2010, Kuusela et al., 2009] required the inversion of the Fisher information at every step (ours does not), and also used VBEM steps for some parameters and Riemannian optimisation for other variables. The collapsed nature of the KLC bound means that these VBEM steps are unnecessary: the bound can be computed by parameterizing the distribution of only one set of variables (q(Z)) whilst the implicit distribution of the other variables is given in terms of the first distribution and the data by equation (5).…”
Section: Applicability (mentioning)
confidence: 99%
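The "simple expression for the natural gradient" mentioned in the excerpt is consistent with a standard exponential-family identity, sketched below under the assumption that q has natural parameters theta, log-partition A(theta), and mean parameters eta equal to the gradient of A; this is a generic derivation, not a line-by-line reproduction of the cited paper.

% For an exponential-family q, the Fisher information equals the Jacobian of
% the mean parameters, G(\theta) = \nabla^2_\theta A(\theta) =
% \partial\eta/\partial\theta, and this Jacobian is symmetric. The chain rule
% \nabla_\theta \mathcal{L} = (\partial\eta/\partial\theta)^\top \nabla_\eta \mathcal{L}
% then gives a natural gradient that needs no explicit matrix inversion:
\begin{equation*}
  \widetilde{\nabla}_{\theta} \mathcal{L}
  \;=\; G(\theta)^{-1} \nabla_{\theta} \mathcal{L}
  \;=\; G(\theta)^{-1} \frac{\partial \eta}{\partial \theta} \nabla_{\eta} \mathcal{L}
  \;=\; \nabla_{\eta} \mathcal{L}.
\end{equation*}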
“…where $\langle\cdot,\cdot\rangle_i$ denotes the inner product in Riemannian geometry, which is given by $\tilde{g}^{\top} G(\rho)\, \tilde{g}$. We note from Kuusela et al. [2009] that this can be simplified since $\tilde{g}^{\top} G \tilde{g} = \tilde{g}^{\top} G G^{-1} g = \tilde{g}^{\top} g$, and other conjugate methods, defined in the supplementary material, can be applied similarly.…”
Section: Conjugate Gradient Optimization (mentioning)
confidence: 99%
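As a quick sanity check of the simplification quoted above, the Python snippet below verifies numerically that once the natural gradient is formed, the Riemannian inner product collapses to an ordinary dot product with the Euclidean gradient. The random positive-definite matrix standing in for the metric G is an illustrative assumption.

import numpy as np

# Check that for g_nat = G^{-1} g the Riemannian inner product
# g_nat^T G g_nat equals the plain dot product g_nat^T g, so no extra
# multiplication by the metric is needed inside the recursion.
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 5))
G = A @ A.T + 5.0 * np.eye(5)        # symmetric positive-definite "metric"
g = rng.normal(size=5)               # Euclidean gradient
g_nat = np.linalg.solve(G, g)        # natural gradient G^{-1} g

lhs = g_nat @ G @ g_nat              # Riemannian inner product
rhs = g_nat @ g                      # simplified form
print(np.isclose(lhs, rhs))          # True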