Minimal Expected Regret in Linear Quadratic Control

Jedra, Yassir; Proutière, Alexandre

doi:10.48550/arxiv.2109.14429

Cited by 2 publications

(3 citation statements)

References 12 publications


“…The study of regret in online LQR was re-initiated by Abbasi-Yadkori and Szepesvári (2011), inspired by works in the RL community. Many works followed to propose algorithms which were computationally tractable (Ouyang et al, 2017; Dean et al, 2018; Abeille and Lazaric, 2018; Cohen et al, 2019; Faradonbeh et al, 2020; Jedra and Proutiere, 2021). Lower bounds on the regret of online LQR are presented in Simchowitz and Foster (2020); Cassel et al (2020); Ziemann and Sandberg (2022).…”

Section: Related Work (mentioning)

confidence: 99%

The Fundamental Limitations of Learning Linear-Quadratic Regulators

Lee, Ziemann, Tsiamis et al. 2023

Preprint


We present a local minimax lower bound on the excess cost of designing a linear-quadratic controller from offline data. The bound is valid for any offline exploration policy that consists of a stabilizing controller and an energy-bounded exploratory input. The derivation leverages a relaxation of the minimax estimation problem to Bayesian estimation, and an application of Van Trees' inequality. We show that the bound aligns with system-theoretic intuition. In particular, we demonstrate that the lower bound increases when the optimal control objective value increases. We also show that the lower bound increases when the system is poorly excitable, as characterized by the spectrum of the controllability Gramian of the system mapping the noise to the state and the $\mathcal{H}_\infty$ norm of the system mapping the input to the state. We further show that for some classes of systems, the lower bound may be exponential in the state dimension, demonstrating exponential sample complexity for learning the linear-quadratic regulator offline.



“…In a closely related line of work, Dean et al [2018] provide an $O(T^{2/3})$ regret bound for robust adaptive LQR control, drawing inspiration from classical methods in system identification and robust adaptive control. It has since been shown that certainty equivalent control, without robustness, can attain the (locally) minimax optimal $O(\sqrt{T})$ regret [Mania et al, 2019, Faradonbeh et al, 2020, Lale et al, 2020a, Jedra and Proutiere, 2021]. In particular, by providing nearly matching upper and lower bounds, Simchowitz and Foster [2020] refine this analysis and establish that the optimal rate, without taking system theoretic quantities into account, is $R_T = \Theta(\sqrt{p^2 n T})$.…”

Section: Related Work (mentioning)

confidence: 99%

“…The goal is to learn a linear gain $K \in \mathbb{R}^{m \times n}$ such that the closed-loop system $A + BK$ is stable, i.e., such that its spectral radius $\rho(A + BK)$ is less than one. Many algorithms for online LQR require the existence of such a stabilizing gain to initialize the online learning policy [Simchowitz and Foster, 2020, Jedra and Proutiere, 2021]. Furthermore, stabilization is a problem of independent interest [Faradonbeh et al, 2018b].…”

Section: Introduction (mentioning)

confidence: 99%
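
The stabilization condition in the statement above, a spectral radius of the closed-loop matrix below one, can be checked numerically. The following is a minimal sketch, not taken from any of the cited papers: the matrices `A` and `B`, the gains `K_zero` and `K_good`, and the helper names are all hypothetical and chosen only for illustration.

```python
import numpy as np


def spectral_radius(M):
    """Largest eigenvalue magnitude of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(M))))


def is_stabilizing(A, B, K):
    """True if the closed loop A + B K has spectral radius below one."""
    return spectral_radius(A + B @ K) < 1.0


# Hypothetical open-loop unstable system with n = 2 states and m = 1 input.
A = np.array([[1.2, 1.0],
              [0.0, 1.2]])
B = np.array([[0.0],
              [1.0]])

# No feedback: the closed loop equals A, whose eigenvalues are both 1.2.
K_zero = np.zeros((1, 2))

# Hand-picked gain (an assumption) placing the closed-loop eigenvalues
# at 0.5 and 0.4.
K_good = np.array([[-0.56, -1.5]])

print(is_stabilizing(A, B, K_zero))
print(is_stabilizing(A, B, K_good))
```

Here `K` has shape m×n, matching the convention $K \in \mathbb{R}^{m \times n}$ in the quoted statement. In the learning setting the quoted text describes, `A` and `B` are unknown, so a check like this is only available to an analyst who knows the true system.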

Learning to Control Linear Systems can be Hard

Tsiamis, Ziemann, Morari et al. 2022

Preprint


In this paper, we study the statistical difficulty of learning to control linear systems. We focus on two standard benchmarks, the sample complexity of stabilization, and the regret of the online learning of the Linear Quadratic Regulator (LQR). Prior results state that the statistical difficulty for both benchmarks scales polynomially with the system state dimension up to system-theoretic quantities. However, this does not reveal the whole picture. By utilizing minimax lower bounds for both benchmarks, we prove that there exist non-trivial classes of systems for which learning complexity scales dramatically, i.e. exponentially, with the system dimension. This situation arises in the case of underactuated systems, i.e. systems with fewer inputs than states. Such systems are structurally difficult to control and their system-theoretic quantities can scale exponentially with the system dimension, dominating learning complexity. Under some additional structural assumptions (bounding systems away from uncontrollability), we provide qualitatively matching upper bounds. We prove that learning complexity can be at most exponential with the controllability index of the system, that is the degree of underactuation.
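
The controllability index mentioned in the abstract can be illustrated with a short computation. The sketch below uses the standard textbook definition (the smallest k for which [B, AB, ..., A^(k-1)B] has full row rank), which is assumed here to be the notion the abstract refers to. The chain system and the function name are hypothetical examples, not taken from the paper.

```python
import numpy as np


def controllability_index(A, B, tol=1e-9):
    """Smallest k with rank [B, AB, ..., A^(k-1) B] = n, or None if no such k <= n."""
    n = A.shape[0]
    blocks = []
    block = B.copy()
    for k in range(1, n + 1):
        blocks.append(block)
        if np.linalg.matrix_rank(np.hstack(blocks), tol=tol) == n:
            return k
        block = A @ block  # next block A^k B
    return None


# Hypothetical chain of n = 4 states: state i is driven by state i + 1.
n = 4
A = np.eye(n, k=1)

# Underactuated case: a single input entering at the end of the chain.
B_single = np.zeros((n, 1))
B_single[-1, 0] = 1.0

# Fully actuated case: one input per state.
B_full = np.eye(n)

print(controllability_index(A, B_single))
print(controllability_index(A, B_full))
```

In the single-input chain, the input needs as many steps as there are states to reach every state, so the index equals the state dimension. With one input per state the index is one.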

