2021 American Control Conference (ACC)
DOI: 10.23919/acc50511.2021.9482889

Adaptive Shielding under Uncertainty

Abstract: This paper targets control problems that exhibit specific safety and performance requirements. In particular, the aim is to ensure that an agent, operating under uncertainty, will at runtime strictly adhere to such requirements. Previous works create so-called shields that correct an existing controller for the agent if it is about to take unbearable safety risks. However, so far, shields do not consider that an environment may not be fully known in advance and may evolve for complex control and learning tasks…

Cited by 14 publications (9 citation statements)
References 25 publications (42 reference statements)
“…For future work, we plan to investigate the application of online shielding in other settings, such as decision making in robotics and control. Another interesting extension would be to incorporate quantitative performance measures in the form of rewards and costs into the computation of the online shield, as previously demonstrated in an offline manner [4] and in a hybrid approach [28], where runtime information was used to learn the environment dynamics.…”
Section: Discussion (mentioning)
confidence: 99%
“…Shields are usually constructed offline by computing a maximally permissive policy that contains all actions that will not violate the safety specification. Several extensions exist [6,37,4,28]. The shielding approach has been shown to be successful in combination with RL [2,21].…”
Section: Related Work (mentioning)
confidence: 99%
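
The offline construction mentioned in the statement above (a maximally permissive policy that keeps every action that cannot violate the safety specification) can be illustrated with a small fixed-point computation. This is only a minimal sketch, not the method of the cited papers: it assumes a finite, nondeterministic transition system with an adversarial environment, and the names `compute_shield`, `transitions`, and `unsafe`, as well as the four-state example, are hypothetical.

```python
def compute_shield(states, actions, transitions, unsafe):
    """Greatest fixed point of the safe set, plus the allowed actions per state.

    transitions maps (state, action) to the set of possible successor states.
    Assumption: the environment may pick any successor, so an action is kept
    only if every one of its successors stays inside the safe set.
    """
    winning = set(states) - set(unsafe)

    def keeps_safe(state, action):
        successors = transitions.get((state, action))
        return bool(successors) and successors <= winning

    changed = True
    while changed:
        changed = False
        for s in list(winning):
            # Drop a state once no action can guarantee staying safe.
            if not any(keeps_safe(s, a) for a in actions):
                winning.remove(s)
                changed = True

    # The shield: every action that is guaranteed to stay in the safe set.
    allowed = {s: {a for a in actions if keeps_safe(s, a)} for s in winning}
    return winning, allowed


# Hypothetical example: state 3 is unsafe and absorbing.
example_transitions = {
    (0, "a"): {1}, (0, "b"): {0},
    (1, "a"): {3}, (1, "b"): {0, 2},
    (2, "a"): {3}, (2, "b"): {3},
    (3, "a"): {3}, (3, "b"): {3},
}
safe_states, shield = compute_shield(
    states={0, 1, 2, 3},
    actions={"a", "b"},
    transitions=example_transitions,
    unsafe={3},
)
print(safe_states, shield)
```

In this example, state 2 cannot avoid the unsafe state, state 1 cannot avoid state 2, and so only state 0 with action "b" survives. At runtime, a shield of this kind would overrule the controller only when it proposes an action outside `shield[state]`.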
“…In [5] and [6], shields were used to ensure the safety of human-interactive robotics. A shield that adapted to a changing environment was proposed and applied to traffic light controllers to maintain the correct traffic flow [7]. Safe reinforcement learning using shields was proposed in [8] and [9].…”
Section: Introduction (mentioning)
confidence: 99%
“…Another approach is verifying safety on the fly using MPC safety certification [22]. A similar research line is adaptive reinforcement learning [23], [24], where safety is computed for the next k steps and unsafe actions are blocked. There is an intrinsic trade-off when choosing the number k of steps.…”
Section: Introduction (mentioning)
confidence: 99%
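
The k-step idea in the statement above, and the trade-off in choosing k, can be made concrete with a bounded lookahead check. This is a hedged sketch under stated assumptions rather than the algorithm of the cited works [23], [24]: it assumes a finite transition system with a worst-case environment, and `is_k_safe`, `allowed_actions`, and the example data are hypothetical names introduced for illustration.

```python
def is_k_safe(state, action, k, actions, transitions, unsafe):
    """True if taking `action` cannot force an unsafe state within k steps.

    Assumption: the environment picks successors adversarially, while the
    agent may pick any remaining action in the following steps.
    """
    successors = transitions.get((state, action), set())
    if not successors or successors & unsafe:
        return False
    if k <= 1:
        return True
    # Every successor must still offer at least one (k-1)-step safe action.
    return all(
        any(is_k_safe(nxt, a, k - 1, actions, transitions, unsafe) for a in actions)
        for nxt in successors
    )


def allowed_actions(state, k, actions, transitions, unsafe):
    """Actions that survive the k-step check; all others are blocked."""
    return {a for a in actions if is_k_safe(state, a, k, actions, transitions, unsafe)}


# Hypothetical example: state 3 is unsafe and absorbing.
lookahead_transitions = {
    (0, "a"): {1}, (0, "b"): {0},
    (1, "a"): {3}, (1, "b"): {0, 2},
    (2, "a"): {3}, (2, "b"): {3},
    (3, "a"): {3}, (3, "b"): {3},
}
for horizon in (1, 2, 3):
    print(horizon, allowed_actions(0, horizon, {"a", "b"}, lookahead_transitions, {3}))
```

With horizons 1 and 2, both actions are allowed in state 0; only at horizon 3 does the check detect that action "a" can lead to an unavoidable violation. A larger k therefore blocks more hazards, but the naive recursion above grows exponentially in k, which is one way to read the trade-off.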