2014
DOI: 10.1007/978-3-319-13823-7_31

Safe Exploration Techniques for Reinforcement Learning – An Overview

Cited by 56 publications (38 citation statements)
References 16 publications
“…For example, tailoring the insulin delivery policy of an artificial pancreas to the metabolism of an individual requires trial insulin delivery action but these should only be sampled when their outcome is within safe certainty bounds [44]. If safety is a significant concern in the systems' application domain, specifically designed safety-aware RL techniques may be required, see [149] and [64] for overviews of such techniques.…”
Section: A Classification Of Personalization Settings (mentioning)
confidence: 99%
“…Some directly enforce a set of safety rules to prevent certain actions leading to unsafe states (90); some encode task safety specifications in the reward function to encourage agents to explore safely (91). An overview for safe exploration in reinforcement learning can be found in (92). Although safe exploration reduces exploration failures, which helps maintain exploration continuity, it may limit exploration to a subspace so that the search may not be thorough (93).…”
Section: Active Learning and Exploration (mentioning)
confidence: 99%
“…In this work, we conceive a dynamic pandemic lockdown strategy that factors in public health infrastructure of a geographical region. The proposed approach built upon reinforcement learning (RL) allows agents to take decisions to maximize reward, while adapting to a complex and uncertain environment (Tuyls and Weiss 2012; Pecka and Svoboda 2014). We create an agent-based simulation environment running the ordinary differential equation-based SEIRD epidemic model (Hethcote 2000) (discussed in "Scenario" section).…”
Section: Issues In Vaccine Production and Supply (mentioning)
confidence: 99%