SOS: Safe, Optimal and Small Strategies for Hybrid Markov Decision Processes

Ashok, Pranav; Křetínský, Jan; Larsen, Kim Guldstrand; Coënt, Adrien Le; Taankvist, Jakob Haahr; Weininger, Maximilian

doi:10.1007/978-3-030-30281-8_9

Cited by 19 publications

(12 citation statements)

References 48 publications

(64 reference statements)


“…We use the mathematical modeling framework of hybrid Markov decision process (HMDP), adapted from Ashok et al (2019); Larsen et al (2016). Definition 1.…”

Section: Preliminaries (mentioning)

confidence: 99%

“…Therefore, the presented model setup should be adapted to an on-line model-predictive control setting. Second, to increase the explainability of the synthesized strategies, it is to be investigated whether exporting strategies to decision trees, see Ashok et al (2019), is possible. Third, it would be interesting to validate the approach with real-life data.…”

Section: Conclusion and Future Challenges (mentioning)

confidence: 99%


Learning Safe and Optimal Control Strategies for Storm Water Detention Ponds

Goorden, Larsen, Nielsen et al. 2021

Preprint


Storm water detention ponds are used to manage the discharge of rainfall runoff from urban areas to nearby streams. Their purpose is to reduce the hydraulic impact and sediment loads of the receiving waters. Detention ponds are currently designed based on static controls: the output flow of a pond is capped at a fixed value. This is not optimal with respect to the current infrastructure capacity and for some detention ponds it might even violate current regulations set by the European Water Framework Directive. We apply formal methods to synthesize (i.e., derive automatically) a safe and optimal active controller. We model the storm water detention pond, including the urban catchment area and the rain forecasts, as a hybrid Markov decision process. Subsequently, we use the tool Uppaal Stratego to synthesize a control strategy minimizing the cost related to pollution (optimality) while guaranteeing no emergency overflow of the detention pond (safety). Simulation results for an existing pond show that Uppaal Stratego can learn optimal strategies that prevent emergency overflows, where the current static control is not always able to prevent it. At the same time, our approach can improve sedimentation during low rain periods.



“…This results in formalisms called (concurrent) timed games [11,6], for which tools like UPPAAL-Tiga are available [4]. Nevertheless, the algorithmics of timed games is costly, and for instance, the mere existence of a controller is an EXPTIME-complete problem [10] and the resulting strategies can be very large [1]. Furthermore from a modelling point of view, often the exact timing constraints are unknown and indeed not needed to ensure the existence of a winning strategy.…”

Section: Introduction (mentioning)

confidence: 99%

A Turn-Based Approach for Qualitative Time Concurrent Games

Haddad, Lime, Roux 2021

Application and Theory of Petri Nets and Concurrency


We address concurrent games with a qualitative notion of time with parity objectives. This setting allows to express how potential controllers interact with their environment and more specifically includes relevant features: transient states where the environment will eventually act, controller avoiding of an environment action either by an immediate controller action or by masking it, etc. In order to solve the controller synthesis in this framework, we design a linear-time building of a timeless turn-based game and show a close connection between strategies of the controller in the two games. Thus we reduce the synthesis problem to a standard problem of turn-based game with parity objectives establishing as a side effect that pure memoryless strategies are enough for winning. Moreover we introduce permissiveness for safety and reachability games as a criterion to choose between winning strategies and prove that one can compute a most permissive strategy (when it exists) in linear time.


“…However, if we do not store the controllers as lookup tables, but take advantage of decision trees (DT) [9], which exploit their hidden structure to represent them in a more compact way, we can mitigate this problem. As shown in [10], DTs can be orders of magnitude smaller than lookup tables. Such a concise representation opens the door for better readability, understandability, and explainability of the controllers, while reducing memory requirements and preserving correctness guarantees.…”

Section: Introduction (mentioning)

confidence: 99%

“…Alternatively, one can apply other kinds of reduction by determinization as post-processing after constructing the DT. For instance, in "safe pruning" of [10], the DT constructed for the maximally permissive controller is modified as follows. The leaves of the tree are merged in a bottom-up fashion, thereby reducing the size and partially determinizing it.…”

Section: Introduction (mentioning)

confidence: 99%
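The bottom-up leaf merging described in the statement above can be illustrated with a minimal Python sketch. It is not the implementation from [10] or from dtControl. It assumes that each leaf stores the set of actions the maximally permissive controller allows, and that two sibling leaves may be merged whenever their action sets intersect, with the merged leaf keeping only the shared actions. The names `Leaf`, `Node`, `safe_prune`, and `size`, as well as the feature names and thresholds, are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Union


@dataclass(frozen=True)
class Leaf:
    # Actions the permissive controller allows in this region (assumed representation).
    actions: FrozenSet[str]


@dataclass(frozen=True)
class Node:
    feature: str      # hypothetical state-variable name
    threshold: float  # go left if state[feature] <= threshold
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]


def safe_prune(tree: Tree) -> Tree:
    """Merge sibling leaves bottom-up when they share at least one action."""
    if isinstance(tree, Leaf):
        return tree
    # Prune the children first so that merges can propagate upwards.
    left = safe_prune(tree.left)
    right = safe_prune(tree.right)
    if isinstance(left, Leaf) and isinstance(right, Leaf):
        common = left.actions & right.actions
        if common:
            # Every state under this node keeps at least one allowed action,
            # so the merged leaf only restricts the permissive controller.
            return Leaf(common)
    return Node(tree.feature, tree.threshold, left, right)


def size(tree: Tree) -> int:
    """Count all nodes (decision nodes and leaves)."""
    if isinstance(tree, Leaf):
        return 1
    return 1 + size(tree.left) + size(tree.right)


example = Node(
    "x", 1.0,
    Node("y", 0.5, Leaf(frozenset({"a", "b"})), Leaf(frozenset({"b", "c"}))),
    Leaf(frozenset({"b"})),
)
pruned = safe_prune(example)
print(size(example), size(pruned))
```

In this example the two inner leaves share action `b`, so they collapse into one leaf, which then merges with its sibling as well. Because the merged leaf keeps only a subset of the originally allowed actions, the result is smaller and partially determinized, matching the description in the quoted statement.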

dtControl: Decision Tree Learning Algorithms for Controller Representation

Ashok, Jackermeier, Jagtap et al. 2020

Preprint

Self Cite


Decision tree learning is a popular classification technique most commonly used in machine learning applications. Recent work has shown that decision trees can be used to represent provably-correct controllers concisely. Compared to representations using lookup tables or binary decision diagrams, decision trees are smaller and more explainable. We present dtControl, an easily extensible tool for representing memoryless controllers as decision trees. We give a comprehensive evaluation of various decision tree learning algorithms applied to 10 case studies arising out of correct-by-construction controller synthesis. These algorithms include two new techniques, one for using arbitrary linear binary classifiers in the decision tree learning, and one novel approach for determinizing controllers during the decision tree construction. In particular the latter turns out to be extremely efficient, yielding decision trees with a single-digit number of decision nodes on 5 of the case studies.

