2019 IEEE 58th Conference on Decision and Control (CDC) 2019
DOI: 10.1109/cdc40024.2019.9029423

Learning Safe Policies via Primal-Dual Methods

Cited by 32 publications (31 citation statements)
References 14 publications
“…μ_i (E[ℓ_i(φ(x), y)] − c_i) ≥ P, (18) for all φ ∈ H. Note, however, that the left-hand side of (18) is the Lagrangian (8), i.e., (18) implies that L(φ, μ) ≥ P. In particular, this holds for the minimum of L(φ, μ), implying that D ≥ P and therefore that strong duality holds for (P-CSL).…”
Section: Discussion (mentioning)
confidence: 99%

Constrained Learning with Non-Convex Losses

Chamon, Paternain, Calvo-Fullana et al. 2021
Preprint
Self Cite
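
The duality step quoted in the Discussion statement above can be spelled out as a short chain of inequalities. This is only a sketch under stated assumptions: the dual value D is taken to be the maximum over nonnegative multipliers of the minimized Lagrangian, μ is assumed dual feasible, and the final step uses the standard weak-duality bound D ≤ P, which the quoted excerpt does not state explicitly.

```latex
\begin{align*}
L(\phi, \mu) \ge P \quad \forall\, \phi \in \mathcal{H}
  \;&\Longrightarrow\; \min_{\phi \in \mathcal{H}} L(\phi, \mu) \ge P \\
  \;&\Longrightarrow\; D = \max_{\mu' \ge 0} \, \min_{\phi \in \mathcal{H}} L(\phi, \mu') \ge P \\
D \ge P \ \text{ and } \ D \le P \ \text{(weak duality)}
  \;&\Longrightarrow\; D = P .
\end{align*}
```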
“…As these systems become ubiquitous, however, so does the need to constrain their behavior to tackle problems in fairness [7]-[12], robustness [13]-[15], and safety [16]-[18]. Left untethered, learning can lead to biased, prejudiced models or systems prone to tampering (e.g., adversarial examples), and unsafe behaviors [19]-[22].…”
Section: Introduction (mentioning)
confidence: 99%

Constrained Learning with Non-Convex Losses

Chamon, Paternain, Calvo-Fullana et al. 2021
Preprint
Self Cite
“…However, it turns out that the gradient ∇_θ p_s(π) is rather difficult to compute, which is also a major challenge in chance constrained problems [20], [23]. Previous researchers in chance constrained RL usually replace ∇_θ p_s(π) with the gradient of a lower bound of p_s without sufficient theoretical guarantees [14], [15]. In this paper, we introduce an analytical approximated gradient with theoretical basis [23].…”
Section: B. Analytical Gradient for Safe Probability (mentioning)
confidence: 99%
“…Recently, some RL researchers have begun to investigate including different forms of safety constraints in RL algorithms to improve safety for real-world applications [10]-[13]. One of the most popular forms is the chance constraint, which constrains the probability of the control policy violating the state constraint to be below a given level [10], [14], [15]. The chance constraint gives an intuitive and quantitative measure of the safety level of the control policy, so it is suitable for representing the safety demands of real-world systems with uncertainty.…”
Section: Introduction (mentioning)
confidence: 99%
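
To make the chance-constraint idea in the statement above concrete, here is a minimal Monte Carlo sketch. Everything in it is a hypothetical illustration rather than the cited authors' method: the scalar system `x <- x - gain * x + noise`, the function names, the safe set `|x| <= bound`, and the tolerance `delta` are all assumptions chosen for the example. It estimates the safe probability p_s as the fraction of rollouts that never leave the safe set, then checks p_s >= 1 - delta.

```python
import random


def estimate_safe_probability(gain, horizon=20, bound=1.0, noise_std=0.1,
                              num_rollouts=2000, seed=0):
    """Estimate p_s: the fraction of rollouts that stay in |x| <= bound.

    Hypothetical toy system (an assumption for illustration only):
        x_{t+1} = x_t - gain * x_t + Gaussian noise
    """
    rng = random.Random(seed)
    safe_rollouts = 0
    for _ in range(num_rollouts):
        x = 0.5  # assumed initial state
        stayed_safe = True
        for _ in range(horizon):
            u = -gain * x                        # linear feedback policy
            x = x + u + rng.gauss(0.0, noise_std)
            if abs(x) > bound:                   # state constraint violated
                stayed_safe = False
                break
        safe_rollouts += stayed_safe
    return safe_rollouts / num_rollouts


def satisfies_chance_constraint(p_safe, delta):
    """Chance constraint: violation probability is at most delta."""
    return p_safe >= 1.0 - delta


if __name__ == "__main__":
    for gain in (0.8, 0.0):
        p_s = estimate_safe_probability(gain)
        print(gain, p_s, satisfies_chance_constraint(p_s, delta=0.05))
```

An estimate of this kind only evaluates a fixed policy; it does not provide the gradient of p_s with respect to policy parameters, which is the difficulty the quoted statement describes.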