2023
DOI: 10.1109/tac.2023.3234176
Global Convergence of Policy Gradient Primal–Dual Methods for Risk-Constrained LQRs

Cited by 13 publications (17 citation statements)
References 23 publications
“…For reinforcement learning (RL) [9], the focus has been on learning the system dynamics and providing closed-loop guarantees in finite time for both linear [16,24,48] and nonlinear systems [7,37,49,76]. For model-free RL, [32,62,66,100] proved the convergence of policy optimization to the optimal controller for LTI systems, [63,67] for LTV systems, and [82] for partially observed linear systems. For a review of policy optimization (PO) methods for LQR, H∞ control, risk-sensitive control, LQG, and output feedback synthesis, see [34].…”
Section: Control Design Problems for Hyperbolic PDEs Are Hyperbolic (mentioning)
confidence: 99%
“…In this paper, we take an iterative PO perspective to solve (7), viewing G as the optimization matrix. We aim to design a gradient-based method to find an optimal G while maintaining feasibility, and recover the control from (5). As (7) is a challenging constrained nonconvex problem, we leverage a novel convex parameterization to establish the global convergence.…”
Section: B. Direct Data-Driven Formulation (mentioning)
confidence: 99%
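For readers unfamiliar with the iterative scheme the excerpt above describes, a minimal sketch of the projected-gradient idea follows: take a gradient step on the optimization matrix G, then project back onto the feasible set to maintain feasibility. The cost J, the projection, and the initial G below are hypothetical stand-ins, not the data-driven formulation or constraint set of the cited paper.

```python
import numpy as np

def numerical_grad(J, G, eps=1e-6):
    """Central finite-difference gradient of the scalar cost J at matrix G."""
    grad = np.zeros_like(G)
    for idx in np.ndindex(G.shape):
        E = np.zeros_like(G)
        E[idx] = eps
        grad[idx] = (J(G + E) - J(G - E)) / (2.0 * eps)
    return grad

def projected_gradient_descent(J, project, G0, step=0.05, iters=500):
    """Iterate: gradient step on G, then projection to keep G feasible."""
    G = project(G0)
    for _ in range(iters):
        G = project(G - step * numerical_grad(J, G))
    return G

# Toy usage: a quadratic cost over a unit-Frobenius-ball feasible set
# (both chosen only for illustration).
J = lambda G: float(np.sum(G ** 2) - 2.0 * G[0, 0])
project = lambda G: G / max(1.0, np.linalg.norm(G))
G_opt = projected_gradient_descent(J, project, np.ones((2, 2)))
print(G_opt)  # converges toward the projected minimizer of J
```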
“…In this section, we first present our novel PO method for solving (7). Then, we propose a new strongly convex parameterization of (7) to derive the projected gradient dominance property of J(G).…”
Section: Data-Enabled Policy Optimization (mentioning)
confidence: 99%
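The "projected gradient dominance property" the excerpt invokes is typically a Polyak–Łojasiewicz-type inequality stated in terms of the gradient mapping. A generic form is sketched below, where Π_S denotes projection onto the feasible set S and η is a step size; the constant μ, the set S, and the parameterization are assumptions here, not the cited paper's exact statement.

```latex
% Generic projected gradient dominance (PL-type) inequality; constants and
% the parameterization are illustrative assumptions.
J(G) - J(G^\star)
  \;\le\; \mu \,\bigl\lVert \mathcal{G}_{\eta}(G) \bigr\rVert_F^{2},
\qquad
\mathcal{G}_{\eta}(G)
  \;:=\; \frac{1}{\eta}\Bigl( G - \Pi_{\mathcal{S}}\bigl( G - \eta \nabla J(G) \bigr) \Bigr).
```

Combined with smoothness of J, an inequality of this type is the standard route to global linear convergence of projected gradient methods despite the nonconvexity of the underlying problem.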
“…The main focus has been on learning the system dynamics and providing closed-loop guarantees in finite time for both linear systems [15], [23], [29], [42], [77] (and references therein) and nonlinear systems [5], [35], [43], [71]. For model-free RL methods, [30], [56], [60], [90] proved the convergence of policy optimization, a popular model-free RL method, to the optimal controller for linear time-invariant systems, [58], [61] for linear time-varying systems, and [75] for partially observed linear systems. See [32] for a recent review of policy optimization methods for continuous control problems such as the LQR, H∞ control, risk-sensitive control, LQG, and output feedback synthesis.…”
Section: Introduction (mentioning)
confidence: 99%