Online learning as an LQG optimal control problem with random matrices

Gnecco, Giorgio; Bemporad, Alberto; Gori, Marco; Morisi, Rita; Sanguineti, Marcello

doi:10.1109/ecc.2015.7330911

Cited by 1 publication

(5 citation statements)

References 6 publications


“…No periodic re-initialization to $\Sigma_w$ of any of the matrices $\Sigma_k$ was performed. […] of the matrices $\Sigma_k$, the Frobenius norm of the Kalman gain matrix $H_k$ is expected to be small for $k$ large (see formulas (24) and (47)), which is confirmed by Figure 8. Hence, even though the norm of the error $y_k - C_k \hat{w}^{\dagger}_k$ tends to increase when the parameter vector changes, the KF estimate of $w$ at time $k$ is not affected so much by this change (see formula (23)), hence also the OLL estimate does not change so much (see formula (22)).…”

Section: Remark 15 (mentioning)

confidence: 52%
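The shrinking Kalman gain described in the statement above can be illustrated with a small sketch. Formulas (22) to (25) and (47) of the citing paper are not reproduced on this page, so the code below uses the standard textbook Kalman recursion for a constant parameter vector observed through scalar measurements $y_k = c_k^\top w + \varepsilon_k$; the function name `kalman_static`, the synthetic data, and all numerical values are assumptions made for illustration only.

```python
import math
import random


def kalman_static(regressors, measurements, noise_var, prior_cov):
    """Textbook Kalman filter for a constant parameter vector w.

    Assumed model (not taken from the cited paper): y_k = c_k . w + eps_k,
    with eps_k zero-mean and of variance noise_var.
    Returns the final estimate and the gain norm at every step.
    """
    d = len(prior_cov)
    w_hat = [0.0] * d
    cov = [row[:] for row in prior_cov]
    gain_norms = []
    for c, y in zip(regressors, measurements):
        # cov_c = Sigma_k c_k
        cov_c = [sum(cov[i][j] * c[j] for j in range(d)) for i in range(d)]
        # Innovation variance: c_k^T Sigma_k c_k + sigma^2
        innov_var = sum(c[i] * cov_c[i] for i in range(d)) + noise_var
        gain = [v / innov_var for v in cov_c]
        residual = y - sum(c[i] * w_hat[i] for i in range(d))
        w_hat = [w_hat[i] + gain[i] * residual for i in range(d)]
        # Covariance update; the covariance is never re-initialized.
        cov = [[cov[i][j] - gain[i] * cov_c[j] for j in range(d)]
               for i in range(d)]
        # The gain is a d x 1 matrix here, so its Frobenius norm is the
        # Euclidean norm of the vector.
        gain_norms.append(math.sqrt(sum(g * g for g in gain)))
    return w_hat, gain_norms


rng = random.Random(0)
w_true = [1.0, -2.0]  # hypothetical parameter vector
C = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(500)]
Y = [c[0] * w_true[0] + c[1] * w_true[1] + rng.gauss(0, 0.1) for c in C]
w_hat, gain_norms = kalman_static(
    C, Y, noise_var=0.01, prior_cov=[[1.0, 0.0], [0.0, 1.0]]
)
print(w_hat, gain_norms[0], gain_norms[-1])
```

Because the covariance only ever contracts in this sketch, late measurements move the estimate very little, which is consistent with the behaviour the statement attributes to the KF estimate when no re-initialization is performed.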

“…Online learning problems have been investigated, e.g., in [33,38,43,44,51,52], but without using an approach based on optimal control theory. As suggested by the preliminary results that we obtained in [24], such an approach can provide a strong theoretical foundation to the choice of a specific online learning algorithm, by selecting the parameter updates as the outputs of a sequence of control laws that solve a suitable optimal control problem modeling online learning itself. A distinguishing feature of our study is that we derive online learning algorithms as closed-form optimal solutions to suitable online learning problems.…”

Section: Application Of Machine-learning Techniques To Optimization/o... (mentioning)

confidence: 99%

“…d) More complex models for the measurement errors: the measurement errors $\varepsilon_k$ could have nonzero means, nonidentical distributions, and/or be not mutually independent. The first two cases can be dealt with in a straightforward way: indeed, in the first case one has only to subtract the expectation of $\varepsilon_k$ from the measure $y_k$ before presenting it as an input to the KF, while in the second case one has to insert an additional index $k$ to $\sigma^2_{\varepsilon}$, using terms of the form $\sigma^2_{\varepsilon_k}$ in the Kalman-filter recursion scheme (25) and in the Kalman gain matrix (24). Finally, in the correlated case one could model the measurement noise as the output of an auxiliary uncontrolled linear dynamical system, which receives mutually independent noises as inputs.…”

Section: Remark 15 (mentioning)

confidence: 99%
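The first two cases in the statement above (nonzero noise means and step-dependent noise variances) can be sketched as follows. This is a minimal illustration built on the standard textbook Kalman recursion, not on the paper's formulas (24) and (25), which are not shown on this page; the function name `kalman_static_hetero`, the intercept-plus-slope model, and all numerical values are assumptions.

```python
import random


def kalman_static_hetero(regressors, measurements, noise_means, noise_vars,
                         prior_var=1.0):
    """Kalman filter for a constant w with per-step noise mean and variance.

    Assumed model: y_k = c_k . w + eps_k, E[eps_k] = noise_means[k],
    Var[eps_k] = noise_vars[k], with the eps_k mutually independent.
    """
    d = len(regressors[0])
    w_hat = [0.0] * d
    cov = [[prior_var if i == j else 0.0 for j in range(d)] for i in range(d)]
    for c, y, mean_k, var_k in zip(regressors, measurements,
                                   noise_means, noise_vars):
        # Case 1: subtract the known noise expectation before filtering.
        y_centered = y - mean_k
        cov_c = [sum(cov[i][j] * c[j] for j in range(d)) for i in range(d)]
        # Case 2: use the step-dependent variance in the gain.
        innov_var = sum(c[i] * cov_c[i] for i in range(d)) + var_k
        gain = [v / innov_var for v in cov_c]
        residual = y_centered - sum(c[i] * w_hat[i] for i in range(d))
        w_hat = [w_hat[i] + gain[i] * residual for i in range(d)]
        cov = [[cov[i][j] - gain[i] * cov_c[j] for j in range(d)]
               for i in range(d)]
    return w_hat


rng = random.Random(1)
w_true = [1.0, -2.0]  # hypothetical intercept and slope
n = 400
C = [[1.0, rng.gauss(0, 1)] for _ in range(n)]
means = [0.5] * n  # known, nonzero noise mean
variances = [0.01 if k % 2 == 0 else 1.0 for k in range(n)]
Y = [c[0] * w_true[0] + c[1] * w_true[1] + m + (v ** 0.5) * rng.gauss(0, 1)
     for c, m, v in zip(C, means, variances)]

w_centered = kalman_static_hetero(C, Y, means, variances)
w_uncentered = kalman_static_hetero(C, Y, [0.0] * n, variances)
print(w_centered, w_uncentered)
```

In this toy setup, skipping the centering step pushes the noise mean into the intercept estimate, while the step-dependent variance lets the filter give less weight to the noisier measurements. The correlated-noise case mentioned at the end of the statement would require augmenting the state and is not covered by this sketch.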

“…which is initialized by $\hat{e}^{\bullet,\dagger}_{-1} := 0$, where the Kalman gain matrix $H_{k+1}$ is defined in (24). Moreover, since e and $\hat{w}^{\bullet}_k$ is known at the time $k$, the KF estimate $\hat{w}^{\dagger}_k$ of $w$ at the time $k$, based on the information vector $I_k$, satisfies $\hat{e}^{\bullet,\dagger}$…”

Section: Proof Of Proposition (mentioning)

confidence: 99%

“…We recall here the equations (25), (24), and (23), which are needed to compute the KF estimate $\hat{w}^{\dagger}_k$:…”

Section: Proof Of Proposition (mentioning)

confidence: 99%


LQG Online Learning

et al. 2017

Self Cite


Optimal control theory and machine learning techniques are combined to formulate online learning from supervised examples, with regularization of the updates, as an optimal control problem and to solve it in closed form. The connections with the classical Linear Quadratic Gaussian (LQG) optimal control problem are investigated; the proposed learning paradigm is a non-trivial variation of that problem because it involves random matrices. The obtained optimal solutions are compared with the Kalman-filter estimate of the parameter vector to be learned. It is shown that, thanks to the regularization term, the proposed algorithm is less sensitive to outliers than the Kalman estimate and thus provides estimates that vary more smoothly over time. The basic formulation of the proposed online-learning framework refers to a discrete-time setting with a finite learning horizon and a linear model. Various extensions are investigated, including the infinite learning horizon and, via the so-called "kernel trick", the case of nonlinear models.
