2007
DOI: 10.1162/neco.2007.19.4.1097
State-Space Models: From the EM Algorithm to a Gradient Approach

Abstract: Slow convergence is observed in the EM algorithm for linear state-space models. We propose to circumvent the problem by applying any off-the-shelf quasi-Newton-type optimizer, which operates on the gradient of the log-likelihood function. Such an algorithm is a practical alternative due to the fact that the exact gradient of the log-likelihood function can be computed by recycling components of the expectation-maximization (EM) algorithm. We demonstrate the efficiency of the proposed method in three relevant i…
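As a rough illustration of the approach summarized in the abstract (a minimal sketch, not the authors' code): the log-likelihood of a linear-Gaussian state-space model can be evaluated with a Kalman filter via the prediction-error decomposition and handed to an off-the-shelf quasi-Newton optimizer. The sketch below uses scipy's L-BFGS-B with finite-difference gradients; the paper's point is that the exact gradient can instead be recycled from quantities the EM algorithm already computes. The scalar-state simplification, fixed C = 1, and all variable names are illustrative assumptions.

import numpy as np
from scipy.optimize import minimize

def neg_log_likelihood(theta, y):
    # Scalar model  x_t = a x_{t-1} + w_t,  y_t = x_t + v_t  (C fixed at 1 to
    # sidestep the usual scale ambiguity between C and Q in this toy example).
    a, log_q, log_r = theta
    q, r = np.exp(log_q), np.exp(log_r)   # log-parameterization keeps variances positive
    m, p = 0.0, 1.0                       # prior mean and variance of the initial state
    nll = 0.0
    for yt in y:
        m_pred, p_pred = a * m, a * a * p + q             # Kalman prediction
        s = p_pred + r                                    # innovation variance
        e = yt - m_pred                                    # innovation
        nll += 0.5 * (np.log(2 * np.pi * s) + e * e / s)   # prediction-error decomposition
        k = p_pred / s                                     # Kalman gain
        m, p = m_pred + k * e, (1 - k) * p_pred            # Kalman update
    return nll

# Simulate data, then fit by an off-the-shelf quasi-Newton method (L-BFGS-B).
rng = np.random.default_rng(0)
T, a_true, q_true, r_true = 300, 0.9, 0.1, 0.5
x = np.zeros(T)
for t in range(1, T):
    x[t] = a_true * x[t - 1] + rng.normal(0.0, np.sqrt(q_true))
y = x + rng.normal(0.0, np.sqrt(r_true), size=T)

res = minimize(neg_log_likelihood, np.array([0.5, 0.0, 0.0]), args=(y,), method="L-BFGS-B")
print(res.x)  # estimated (a, log Q, log R)

With the exact analytic gradient obtained from E-step quantities, as the paper describes, each quasi-Newton iteration avoids the extra likelihood evaluations that finite differences require.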

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
2
1
1
1

Citation Types

0
14
0

Year Published

2009
2009
2017
2017

Publication Types

Select...
6
3

Relationship

1
8

Authors

Journals

Cited by 18 publications (14 citation statements). References 10 publications.
“…It is well-known that EM can converge slowly, especially in cases in which the so-called “ratio of missing information” is large; see [19, 57, 88, 68] for details. In practice, we have found that direct gradient ascent of expression (12) is significantly more efficient than the EM approach in the models discussed in this paper; for example, we used the direct approach to perform parameter estimation in the retinal example discussed above in section 2.3.…”
Section: Parameter Estimation
confidence: 99%
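For context on the "ratio of missing information" mentioned in this excerpt, the classical result from EM theory (standard notation, not part of the quoted text) is that near the maximum-likelihood estimate the EM iteration contracts linearly,

\theta^{(k+1)} - \hat{\theta} \;\approx\; \bigl(I - \mathcal{I}_{\mathrm{com}}^{-1}\,\mathcal{I}_{\mathrm{obs}}\bigr)\,\bigl(\theta^{(k)} - \hat{\theta}\bigr), \qquad \mathcal{I}_{\mathrm{com}} = \mathcal{I}_{\mathrm{obs}} + \mathcal{I}_{\mathrm{mis}},

so the convergence rate is governed by the largest eigenvalue of \mathcal{I}_{\mathrm{com}}^{-1}\mathcal{I}_{\mathrm{mis}}, the fraction of missing information. When the latent states carry most of the information this eigenvalue approaches 1 and EM crawls, which is the regime where direct gradient ascent of the marginal likelihood pays off.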
“…Nevertheless, this initial assumption enabled us to investigate the invariance (or lack thereof) of the state vector with respect to the kinematics of the behavior, as described in Section 3. Parameters θ = {A, C, Q, R} are typically estimated by maximizing their log-likelihood L(θ), through either gradient-based maximum-likelihood methods or the expectation-maximization (EM) algorithm (Olsson et al. 2007). …”
Section: Methods
confidence: 99%
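For reference, the model behind the parameters θ = {A, C, Q, R} in this excerpt is the standard linear-Gaussian state-space model (standard notation, assumed rather than quoted):

x_{t+1} = A x_t + w_t, \quad w_t \sim \mathcal{N}(0, Q), \qquad y_t = C x_t + v_t, \quad v_t \sim \mathcal{N}(0, R),

and the log-likelihood that either EM or a gradient method maximizes follows from the Kalman filter's prediction-error decomposition,

L(\theta) = \sum_{t=1}^{T} \log \mathcal{N}\bigl(y_t \,;\, C\hat{x}_{t|t-1},\; C P_{t|t-1} C^{\top} + R\bigr),

where \hat{x}_{t|t-1} and P_{t|t-1} are the one-step-ahead predicted state mean and covariance.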
“…For example, we may use the MAP path to approximate the conditional expectations E(x i |y), and use the diagonal elements of the inverse Hessian matrix J to approximate the second moments V ar(x i |y) (recall that these diagonal elements of the inverse Hessian may be obtained in O(N ) time, because the full matrix inverse is not required [36][37][38]). Although the EM algorithm is fairly standard, many authors have reported that convergence tends to be unnecessarily slow ( [40,41] and references therein). We have seen similar effects here; in practice, we have found that directly optimizing the marginal likelihood is much faster than iterating the EM algorithm to convergence.…”
Section: EM Algorithm
confidence: 99%
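As a minimal sketch of why the diagonal of the inverse Hessian is available in O(N) time without forming the full inverse (assuming, for illustration, a scalar state so that J is tridiagonal; the block-tridiagonal case is analogous): an LDL^T factorization followed by a backward recursion returns exactly the diagonal entries of J^{-1}. The function name and test matrix below are illustrative, not taken from the cited paper.

import numpy as np

def tridiag_inverse_diagonal(d, e):
    """d: main diagonal (length N), e: sub/super-diagonal (length N-1) of SPD tridiagonal J."""
    n = len(d)
    # Forward pass: J = L D L^T with unit lower-bidiagonal L (subdiagonal l).
    dd = np.empty(n)      # entries of D
    l = np.empty(n - 1)   # subdiagonal of L
    dd[0] = d[0]
    for i in range(n - 1):
        l[i] = e[i] / dd[i]
        dd[i + 1] = d[i + 1] - l[i] * e[i]
    # Backward pass: Sigma_ii = 1/dd_i + l_i^2 * Sigma_{i+1,i+1}
    sigma = np.empty(n)
    sigma[-1] = 1.0 / dd[-1]
    for i in range(n - 2, -1, -1):
        sigma[i] = 1.0 / dd[i] + l[i] ** 2 * sigma[i + 1]
    return sigma

# Check against a dense inverse on a small example.
n = 6
d = np.full(n, 4.0)
e = np.full(n - 1, -1.0)
J = np.diag(d) + np.diag(e, 1) + np.diag(e, -1)
print(np.allclose(tridiag_inverse_diagonal(d, e), np.diag(np.linalg.inv(J))))  # True

Both passes touch each index once, so the cost is linear in N.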
“…Thus, the computation of the gradient of log marginal likelihood function can in general be carried out via the E-step of the EM algorithm [16,40,41]. In our case, the gradient of the log-likelihood function (55) can be connected explicitly to the Laplace approximation (46).…”
Section: EM Algorithm
confidence: 99%
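The identity behind this last excerpt is Fisher's identity (written here in generic notation, not the excerpt's equations (55) and (46)): the gradient of the log marginal likelihood equals the posterior expectation of the complete-data score, which is exactly the quantity the E-step provides,

\nabla_{\theta} \log p(y \mid \theta) \;=\; \mathbb{E}_{p(x \mid y, \theta)}\bigl[\nabla_{\theta} \log p(x, y \mid \theta)\bigr].

This is the sense in which the gradient of the log-likelihood can be "recycled" from EM components, as stated in the abstract above.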