Dynamic Regret Minimization for Control of Non-stationary Linear Dynamical Systems

Luo, Yuwei; Gupta, Varun; Kolar, Mladen

doi:10.1145/3489048.3522649

Cited by 1 publication

(1 citation statement)

References 1 publication


“…For partially observable systems strong regret guarantees are provided in . Luo et al (2022) provides an O(n^{3/5}) dynamic regret bound for the case when the system matrices (A_t, B_t) can change over time. Their results are incompatible to ours in that they consider unknown dynamics, stochastic disturbances and the dynamic regret compete with controllers that are pointwise optimal (restricted dynamic regret), while we assume known dynamics, adversarial disturbances and compete with an arbitrary sequence of controllers (i.e., universal dynamic regret).…”

Section: Related Work (mentioning)

confidence: 99%

Optimal Dynamic Regret in LQR Control

Baby, Wang

2022

Preprint


We consider the problem of nonstochastic control with a sequence of quadratic losses, i.e., LQR control. We provide an efficient online algorithm that achieves an optimal dynamic (policy) regret of Õ(max{n^{1/3} TV(M_{1:n})^{2/3}, 1}), where TV(M_{1:n}) is the total variation of any oracle sequence of Disturbance Action policies parameterized by M_1, ..., M_n, chosen in hindsight to cater to unknown nonstationarity. The rate improves the best known rate of Õ(√(n(TV(M_{1:n}) + 1))) for general convex losses and we prove that it is information-theoretically optimal for LQR. Main technical components include the reduction of LQR to online linear regression with delayed feedback due to Foster and Simchowitz (2020), as well as a new proper learning algorithm with an optimal Õ(n^{1/3}) dynamic regret on a family of "minibatched" quadratic losses, which could be of independent interest.
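To make the comparison of rates in the abstract concrete, the following is a minimal numerical sketch, not code from the paper. It assumes the earlier rate is √(n(TV(M_{1:n}) + 1)) and ignores the logarithmic factors hidden by Õ. The function names `lqr_rate` and `convex_rate` and the sample values of n and TV are hypothetical choices for illustration.

```python
import math


def lqr_rate(n: int, tv: float) -> float:
    """max{n^(1/3) * TV^(2/3), 1}, with log factors dropped (illustrative)."""
    return max(n ** (1 / 3) * tv ** (2 / 3), 1.0)


def convex_rate(n: int, tv: float) -> float:
    """sqrt(n * (TV + 1)), with log factors dropped (assumed form of the prior rate)."""
    return math.sqrt(n * (tv + 1))


# Compare the two rates for a stationary comparator (TV = 0), a constant
# amount of variation (TV = 1), and variation growing like sqrt(n).
for n in (10**3, 10**6):
    for tv in (0.0, 1.0, math.sqrt(n)):
        print(f"n={n:>8} TV={tv:>8.1f} "
              f"lqr={lqr_rate(n, tv):>10.1f} convex={convex_rate(n, tv):>10.1f}")
```

Under these assumptions the first rate is never larger than the second whenever TV ≤ n, and the gap widens as n grows with TV held fixed.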

