2017
DOI: 10.1007/978-3-319-66790-4_5

Assurance in Reinforcement Learning Using Quantitative Verification

Cited by 11 publications (14 citation statements)
References 36 publications
“…This area of future work was made possible by the recent adoption of our approach within several projects carried out by teams that include researchers and engineers not involved in the EvoChecker development. These projects have used or will use EvoChecker to devise safe reinforcement learning solutions (Mason et al. 2017, 2018), to synthesise robust designs for software-based systems (Calinescu et al. 2017b, c), and to suggest safe evacuation routes for communities affected by adverse events such as natural disasters. This will show how easy it is to define and validate EvoChecker models and requirements in real applications, allowing us to improve the usability of the approach.…”
Section: Discussion (mentioning)
confidence: 99%
“…In recent work, we used Markov decision processes (MDPs) to model an assisted-living SCPS developed to help dementia sufferers with the daily task of hand-washing [12]. The SCPS provided voice prompts to the sufferers in certain MDP states, to guide them through what they must do next, if they were struggling to progress.…”
Section: Example (mentioning)
confidence: 99%
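
The excerpt above describes the hand-washing SCPS as an MDP whose states mark progress through the task and whose voice prompts are actions that make progress more likely. A minimal sketch of such a model follows; every state name, action, and probability in it is an illustrative assumption, not the published model from [12].

```python
# Minimal sketch of a prompt-issuing MDP, loosely following the
# assisted-living example in [12]. All states, actions, and
# probabilities are illustrative assumptions, not the published model.

# States: steps of the hand-washing task; actions: stay silent or prompt.
# transitions[state][action] -> list of (next_state, probability).
# Prompting shifts probability mass towards progressing to the next step.
TRANSITIONS = {
    "idle":       {"no_prompt": [("wet_hands", 0.3), ("idle", 0.7)],
                   "prompt":    [("wet_hands", 0.8), ("idle", 0.2)]},
    "wet_hands":  {"no_prompt": [("apply_soap", 0.4), ("wet_hands", 0.6)],
                   "prompt":    [("apply_soap", 0.9), ("wet_hands", 0.1)]},
    "apply_soap": {"no_prompt": [("rinse", 0.4), ("apply_soap", 0.6)],
                   "prompt":    [("rinse", 0.9), ("apply_soap", 0.1)]},
    "rinse":      {"no_prompt": [("dry", 0.5), ("rinse", 0.5)],
                   "prompt":    [("dry", 0.9), ("rinse", 0.1)]},
    "dry":        {"no_prompt": [("done", 0.6), ("dry", 0.4)],
                   "prompt":    [("done", 0.95), ("dry", 0.05)]},
    "done":       {"no_prompt": [("done", 1.0)],
                   "prompt":    [("done", 1.0)]},
}

def check_stochastic(transitions):
    """Sanity-check that every action's outgoing probabilities sum to 1."""
    for state, by_action in transitions.items():
        for action, dist in by_action.items():
            total = sum(p for _, p in dist)
            assert abs(total - 1.0) < 1e-9, (state, action, total)

check_stochastic(TRANSITIONS)
```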
“…Example 3 Consider again the route-planning and assisted-living SCPS from Examples 1 and 2. While probabilistic temporal logics were successfully used to specify requirements associated with the risks and duration of evacuation routes [1] and with the sequence of voice prompts provided to dementia sufferers [12], these logics cannot easily express requirements such as the interactions between evacuees who use the same route, or the distress experienced by sufferers who receive too many reminders or do not see their carers for long periods of time (open challenge OC1). Furthermore, the effectiveness of these SCPS depends on the accuracy with which events in the evacuated area (e.g., damage to the road infrastructure) and sufferer response to voice prompts, respectively, are mapped to state transition probabilities within the stochastic models that underpin decision making in these systems (OC2).…”
Section: OC2) Ensuring the Accuracy of Stochastic Models of SCPS (mentioning)
confidence: 99%
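
For concreteness, the kinds of requirement that the excerpt says probabilistic temporal logics handled well ([1], [12]) can be written as PCTL properties along these lines; the atomic propositions, probability bounds, and step bound are assumed for illustration, not taken from those papers:

\[
P_{\geq 0.9}\big[\, F^{\leq 120}\ \mathit{evacuated} \,\big]
\qquad
P_{\leq 0.05}\big[\, F\ \mathit{highRiskRoute} \,\big]
\]

The first bounds evacuation duration (the evacuated state is reached within 120 steps with probability at least 0.9); the second bounds route risk (the probability of ever entering a high-risk route is at most 0.05). By contrast, properties about evacuee interaction or accumulated distress (OC1) have no similarly direct encoding, which is the gap the excerpt identifies.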
“…To alleviate this problem, (D)RL algorithms are being combined with formal verification techniques to ensure safety in learning. Even though significant progress has been achieved in this direction [1,5,9,12,19,22], settings with multiple learning agents are comparatively less explored and understood.…”
Section: Introduction (mentioning)
confidence: 99%
“…In this paper we introduce assured multi-agent reinforcement learning (AMARL), a method to formally guarantee the safe behaviour of agents acting in an unknown environment through the satisfaction of safety constraints by the solution learned using a DRL algorithm, both at training and test time. Building upon the assured reinforcement learning (ARL) technique in [19], we combine reinforcement learning and formal verification [13] to ensure the satisfaction of constraints expressed in Probabilistic Computation Tree Logic (PCTL) [11]. Differently from ARL, we support a multi-agent setting and DRL algorithms.…”
Section: Introduction (mentioning)
confidence: 99%
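
In outline, the assure-then-learn pattern that AMARL inherits from ARL [19] can be sketched as follows. The model_check callback stands in for a probabilistic model checker such as PRISM; the function names, constraint string, and threshold below are illustrative assumptions, not the published interface of [19] or of this paper.

```python
# Sketch of the assure-then-learn pattern behind ARL/AMARL: verify
# candidate abstract policies against a PCTL safety constraint, then
# let the (multi-agent) RL algorithm optimise only among the policies
# that passed verification. All names here are illustrative assumptions.

from typing import Callable, Iterable, List

def assured_policies(
    candidates: Iterable[dict],
    model_check: Callable[[dict, str], float],
    pctl_constraint: str = 'Pmin=? [ G !"unsafe" ]',  # assumed property
    threshold: float = 0.99,
) -> List[dict]:
    """Keep only the abstract policies whose worst-case probability of
    always avoiding unsafe states, as computed by the model checker,
    meets the required threshold."""
    return [
        policy
        for policy in candidates
        if model_check(policy, pctl_constraint) >= threshold
    ]

# The learner then optimises reward only over the assured set, which is
# why the PCTL constraint holds both during training and at test time.
```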