2023
DOI: 10.1109/tac.2022.3152724

Safe Policies for Reinforcement Learning via Primal-Dual Methods

Cited by 33 publications (15 citation statements)
References 40 publications
“…RL with constraints: First, constraints that require some expected cumulative costs over all steps to be bounded have been widely studied in safe RL [19,20,21,8,22,23,24,9,25,26,10,27,28,11,29,30]. Second, many other work, e.g., [31] and [32], studied budget constraints that will halt the learning process whenever the budget has run out of.…”
Section: Related Work (mentioning)
confidence: 99%
“…in Step-3 of Algorithm 1, we update the regularized least-square estimator of the parameter w * h in (11) as follows:…”
Section: A Near-optimal Safe Algorithm (mentioning)
confidence: 99%
“…It is applicable to many constrained control problems by integrating other system specifications in constraints, and admits a natural extension of constrained optimization and Lagrangian over policies. Lagrangian-based policy search methods, especially policy-based primal-dual algorithms that work simultaneously with primal/dual variables, lie at the heart of recent successes of constrained MDPs, e.g., navigation [77], autonomous driving [54,49], robotics [25], and finance [31]; see [44,35,20,48] for more examples.…”
Section: Introduction (mentioning)
confidence: 99%
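
The statement above describes policy-based primal-dual algorithms that update primal and dual variables simultaneously. A minimal sketch of that idea follows, under loudly stated assumptions: a hypothetical single-state problem with two actions, made-up rewards, costs, and budget, a one-parameter sigmoid policy, and exact gradients. It is not the algorithm of the cited paper or of any citing work. Because the last iterates of simultaneous gradient updates can oscillate around the saddle point, the sketch reports running averages instead.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def primal_dual(rewards=(1.0, 0.2), costs=(1.0, 0.0), budget=0.3,
                eta_primal=0.05, eta_dual=0.05, steps=20000):
    """Toy primal-dual loop for: maximize E[r] subject to E[c] <= budget.

    All numbers and names are hypothetical and chosen for illustration only.
    The policy picks action 0 with probability sigmoid(theta).
    """
    theta, lam = 0.0, 0.0          # primal parameter and dual multiplier
    sum_reward = 0.0
    sum_cost = 0.0
    for _ in range(steps):
        p = sigmoid(theta)
        exp_reward = p * rewards[0] + (1.0 - p) * rewards[1]
        exp_cost = p * costs[0] + (1.0 - p) * costs[1]
        sum_reward += exp_reward
        sum_cost += exp_cost

        # Lagrangian: L(theta, lam) = E[r] - lam * (E[c] - budget).
        # Exact gradient in theta, using dp/dtheta = p * (1 - p).
        grad_theta = ((rewards[0] - rewards[1])
                      - lam * (costs[0] - costs[1])) * p * (1.0 - p)

        # Simultaneous updates: both steps use the current (theta, lam).
        new_theta = theta + eta_primal * grad_theta               # primal ascent
        lam = max(0.0, lam + eta_dual * (exp_cost - budget))      # projected dual step
        theta = new_theta

    return sum_reward / steps, sum_cost / steps, lam


if __name__ == "__main__":
    avg_reward, avg_cost, lam = primal_dual()
    print(f"average reward={avg_reward:.3f}, "
          f"average cost={avg_cost:.3f}, final lambda={lam:.3f}")
```

In this toy instance the constrained optimum picks action 0 with probability 0.3, so the averaged cost should settle near the budget of 0.3 and the averaged reward near 0.44. The multiplier is kept nonnegative by the projection `max(0.0, ...)`.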