2021
DOI: 10.48550/arxiv.2105.12204
Preprint
Safe Value Functions

Abstract: The relationship between safety and optimality in control is not well understood, and they are often seen as important yet conflicting objectives. There is a pressing need to formalize this relationship, especially given the growing prominence of learning-based methods. Indeed, it is common practice in reinforcement learning to simply modify reward functions by penalizing failures, with the penalty treated as a mere heuristic. We rigorously examine this relationship, and formalize the requirements for safe val…


Cited by 3 publications (3 citation statements)
References 26 publications
“…Learning with CBFs: Approaches that use CBFs during learning typically assume that a valid CBF is already given, while we focus on constructing CBFs so that our approach can be viewed as complementary. In [19], it is shown how safe and optimal reward functions can be obtained, and how these are related to CBFs. The authors in [20] use CBFs to learn a provably correct neural network safety guard for kinematic bicycle models.…”
Section: A. Related Work (citation type: mentioning; confidence: 99%)
“…These works typically assume that a CBF is already given, while we in this paper focus on constructing CBFs so that our approach should be viewed as complementary. In [17], it is shown how safe and optimal reward functions can be obtained and how these are related to CBFs. The authors in [18] learn a provably correct neural network safety guard for kinematic bicycle models using CBFs as safety filters.…”
Section: Related Work (citation type: mentioning; confidence: 99%)
“…Note that the RAU module encodes both safety and performance specifications in the reward function. It is shown in [64] that a big enough penalty function can assure that the optimal solution for the original task can be learned safely.…”
Section: B. Risk Assessment Unit and Reward Design Using Preview Infor... (citation type: mentioning; confidence: 99%)
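The last citation statement refers to the common practice, also discussed in the abstract, of penalizing failures in the reward function so that a sufficiently large penalty makes the safe-optimal policy coincide with the penalized-optimal one. A minimal sketch of that idea is below; the function and state names are illustrative assumptions, not taken from the cited paper.

```python
# Sketch of reward modification by failure penalty (hypothetical example).
# The claim in [64]: if the penalty is large enough, optimizing the
# penalized reward also yields an optimal policy for the original task
# that never enters unsafe states.

def penalized_reward(state, task_reward, unsafe_states, penalty=100.0):
    """Task reward, minus a large constant penalty on unsafe states."""
    if state in unsafe_states:
        return task_reward - penalty
    return task_reward

# Usage: with penalty larger than any achievable return difference,
# every unsafe transition is worse than any safe trajectory, so the
# optimizer is steered away from unsafe states.
unsafe = {"cliff"}
print(penalized_reward("goal", 1.0, unsafe))   # 1.0 (safe state, unchanged)
print(penalized_reward("cliff", 1.0, unsafe))  # -99.0 (penalized)
```

How large "large enough" must be is exactly what the cited paper formalizes; a heuristic constant like the one above carries no guarantee on its own.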