2019
DOI: 10.1007/978-3-030-30281-8_9

SOS: Safe, Optimal and Small Strategies for Hybrid Markov Decision Processes

Abstract: For hybrid Markov decision processes, Uppaal Stratego can compute strategies that are safe for a given safety property and (in the limit) optimal for a given cost function. Unfortunately, these strategies cannot be exported easily since they are computed as a very long list. In this paper, we demonstrate methods to learn compact representations of the strategies in the form of decision trees. These decision trees are much smaller, more understandable, and can easily be exported as code that can be loaded into …

Cited by 19 publications (12 citation statements)
References 48 publications (64 reference statements)

“…We use the mathematical modeling framework of hybrid Markov decision process (HMDP), adapted from Ashok et al (2019); Larsen et al (2016). Definition 1.…”
Section: Preliminaries (mentioning)
confidence: 99%
“…Therefore, the presented model setup should be adapted to an on-line model-predictive control setting. Second, to increase the explainability of the synthesized strategies, it is to be investigated whether exporting strategies to decision trees, see Ashok et al (2019), is possible. Third, it would be interesting to validate the approach with real-life data.…”
Section: Conclusion and Future Challenges (mentioning)
confidence: 99%
“…This results in formalisms called (concurrent) timed games [11,6], for which tools like UPPAAL-Tiga are available [4]. Nevertheless, the algorithmics of timed games is costly, and for instance, the mere existence of a controller is an EXPTIME-complete problem [10] and the resulting strategies can be very large [1]. Furthermore from a modelling point of view, often the exact timing constraints are unknown and indeed not needed to ensure the existence of a winning strategy.…”
Section: Introduction (mentioning)
confidence: 99%
“…However, if we do not store the controllers as lookup tables, but take advantage of decision trees (DT) [9], which exploit their hidden structure to represent them in a more compact way, we can mitigate this problem. As shown in [10], DTs can be orders of magnitude smaller than lookup tables. Such a concise representation opens the door for better readability, understandability, and explainability of the controllers, while reducing memory requirements and preserving correctness guarantees.…”
Section: Introduction (mentioning)
confidence: 99%
“…Alternatively, one can apply other kinds of reduction by determinization as post-processing after constructing the DT. For instance, in "safe pruning" of [10], the DT constructed for the maximally permissive controller is modified as follows. The leaves of the tree are merged in a bottom-up fashion, thereby reducing the size and partially determinizing it.…”
Section: Introduction (mentioning)
confidence: 99%