2021
DOI: 10.48550/arxiv.2112.12228
Preprint

Direct Behavior Specification via Constrained Reinforcement Learning

Abstract: The standard formulation of Reinforcement Learning lacks a practical way of specifying which behaviors are admissible and which are forbidden. Most often, practitioners approach the task of behavior specification by manually engineering the reward function, a counter-intuitive process that requires several iterations and is prone to reward hacking by the agent. In this work, we argue that constrained RL, which has almost exclusively been used for safe RL, also has the potential to significantly reduce the amount of work …
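For context, the constrained RL setting the abstract refers to is typically formalized as a Constrained Markov Decision Process (CMDP). The following is the standard statement of that objective, not a formula quoted from the paper itself:

```latex
\max_{\pi}\ \mathbb{E}_{\tau \sim \pi}\Big[\sum_{t} \gamma^{t}\, r(s_t, a_t)\Big]
\quad \text{subject to} \quad
\mathbb{E}_{\tau \sim \pi}\Big[\sum_{t} \gamma^{t}\, c_i(s_t, a_t)\Big] \le d_i,
\qquad i = 1, \dots, k
```

Here each cost function c_i flags an undesired behavior and d_i is its admissible budget, so the behavior specification lives in the constraints rather than in a hand-shaped reward.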

Cited by 1 publication (1 citation statement)
References: 29 publications
“…25, where λ is a Lagrange multiplier. In our experiments, we fix λ to be a constant scalar, although adaptive approaches such as (Roy et al., 2021) could be explored.…”
Section: Regularization with Lagrangian Penalties (mentioning)
Confidence: 99%
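To make the fixed-versus-adaptive distinction in the quoted statement concrete, here is a minimal PyTorch sketch. This is an assumed setup: the function names, the exp parameterization of λ, and the learning rate are illustrative and are not taken from either paper. The fixed variant matches the citing paper's choice of a constant scalar λ; the adaptive variant updates λ by gradient ascent on the constraint violation, in the spirit of the Lagrangian methods of Roy et al. (2021):

```python
import torch

def fixed_lagrangian_loss(policy_loss, constraint_cost, lam=10.0, threshold=0.0):
    """Lagrangian penalty with a constant scalar multiplier, as in the
    quoted statement: L = J(pi) + lam * (J_c(pi) - d)."""
    return policy_loss + lam * (constraint_cost - threshold)

# Adaptive multiplier: lambda = exp(log_lam) keeps it non-negative.
log_lam = torch.zeros(1, requires_grad=True)
lam_optimizer = torch.optim.Adam([log_lam], lr=1e-2)

def update_multiplier(constraint_cost, threshold=0.0):
    """One gradient-ascent step on lambda: grow lambda while the
    constraint is violated (J_c > d), shrink it otherwise.
    `constraint_cost` must be a scalar tensor; it is detached so this
    step moves only the multiplier, never the policy."""
    lam = log_lam.exp()
    lam_loss = -lam * (constraint_cost.detach() - threshold)
    lam_optimizer.zero_grad()
    lam_loss.backward()
    lam_optimizer.step()
    return log_lam.exp().detach()
```

In practice the two steps alternate: estimate constraint_cost from a batch of rollouts, take a policy step on the penalized loss, then call update_multiplier so λ tracks how binding the constraint currently is.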