2020
DOI: 10.1007/978-3-030-45190-5_19
Simple Strategies in Multi-Objective MDPs

Abstract: We consider the verification of multiple expected reward objectives at once on Markov decision processes (MDPs). This enables a trade-off analysis among multiple objectives by obtaining a Pareto front. We focus on strategies that are easy to employ and implement, that is, strategies that are pure (no randomization) and have bounded memory. We show that checking whether a point is achievable by a pure stationary strategy is NP-complete, even for two objectives, and we provide an MILP encoding to solve the corre…
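The two notions from the abstract — pure stationary strategies and the Pareto front — can be illustrated with a minimal sketch. The toy MDP below, its states, actions, and reward vectors, are all made up for illustration; transitions are deterministic so that "expected total reward" reduces to a plain sum, and the Pareto front is obtained by brute-force enumeration of all pure stationary strategies (one fixed action per state), not by the paper's MILP encoding.

```python
from itertools import product

# Hypothetical toy MDP (illustrative numbers, deterministic transitions):
# each state maps an action to (next_state, (reward_obj1, reward_obj2)).
mdp = {
    "s0": {"a": ("s1", (1.0, 0.0)), "b": ("s1", (0.0, 1.0))},
    "s1": {"a": ("goal", (2.0, 0.0)), "b": ("goal", (0.0, 2.0))},
}

def total_reward(strategy, start="s0"):
    """Total reward vector of a pure stationary strategy
    (deterministic transitions, so no expectation is needed)."""
    r1 = r2 = 0.0
    s = start
    while s != "goal":
        nxt, (x, y) = mdp[s][strategy[s]]
        r1, r2, s = r1 + x, r2 + y, nxt
    return (r1, r2)

# Enumerate all pure stationary strategies: one action per state.
states = list(mdp)
values = [total_reward(dict(zip(states, choice)))
          for choice in product(*(mdp[s] for s in states))]

def dominated(p, qs):
    """p is dominated if some other point is at least as good in both
    objectives (here both objectives are maximized)."""
    return any(q[0] >= p[0] and q[1] >= p[1] and q != p for q in qs)

pareto = sorted({v for v in values if not dominated(v, values)})
print(pareto)  # [(0.0, 3.0), (1.0, 2.0), (2.0, 1.0), (3.0, 0.0)]
```

All four strategies happen to be non-dominated here, so the printed front has four points; a point such as (1, 2) is achievable only by committing to one specific action in each state, which is exactly the achievability question the paper shows to be NP-complete in general.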

Cited by 22 publications (22 citation statements)
References 38 publications
“…• A comprehensive overview of the strategy complexity (Section 3, Table 1) and computational complexity (Section 4, Table 2) of disjunctive reachability-safety queries in stochastic games, significantly extending previous results from the literature [19,27,35]. In particular, motivated by the observation that randomized strategies are undesirable or meaningless for certain applications (e.g., medical or product design [23]), we study the setting of DQs under deterministic strategies for both players. Notably, this led to rather high complexities: Qualitative queries are PSPACE-hard and quantitative reachability is even undecidable.…”
Section: Contributions and Overview In Summary This Paper Makes The Following Contributionsmentioning
confidence: 81%
“…Executing multi-objective model checking on MDPs for the synthesis of Pareto-optimal policies is an important and non-trivial problem [14]. Despite recent advances [13], [15], [16], [17], [18], existing approaches either use simple iterative methods, or rely on reductions and simplifications to solve the problem using linear programming. This limits their applicability to (i) single-objective problems with multiple strict constraints (for which a single best policy exists); or (ii) unconstrained problems with up to three optimisation objectives.…”
Section: Introductionmentioning
confidence: 99%
“…Multi-objective MDP Various types of objectives known from conventional (single-objective) model checking have been lifted to the multi-objective case. These objectives range over ω-regular specifications including LTL [26,27], expected (discounted and non-discounted) total rewards [21,27,28,52,22], step-bounded and reward-bounded reachability probabilities [28,35], and, most relevant for this work, expected long-run average (LRA) rewards [18,11,20], also known as mean pay-offs. For the latter, all current approaches build upon linear programming (LP), which yields a theoretical time complexity polynomial in the model size.…”
Section: Introductionmentioning
confidence: 99%
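The expected long-run average (mean-payoff) reward mentioned in the last citation statement can be sketched concretely. Under a fixed pure stationary strategy, an MDP collapses into a Markov chain; assuming the chain is unichain, the LRA reward vector is the stationary distribution weighted by the per-state reward vectors. The transition matrix and reward numbers below are made up for illustration, and the stationary distribution is approximated by simple power iteration rather than the LP formulations cited above.

```python
# Induced Markov chain of a fixed pure stationary strategy
# (hypothetical numbers; unichain assumed so the stationary
# distribution is unique).
P = [[0.5, 0.5],
     [0.2, 0.8]]
rewards = [(1.0, 0.0),   # (objective-1, objective-2) reward in state 0
           (0.0, 3.0)]   # ... and in state 1

def stationary(P, iters=10_000):
    """Power iteration: repeatedly apply pi <- pi * P until it settles."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

pi = stationary(P)  # here: (2/7, 5/7)
lra = tuple(sum(pi[s] * rewards[s][k] for s in range(len(P)))
            for k in (0, 1))
print(lra)  # approximately (2/7, 15/7)
```

Each pure stationary strategy yields one such LRA reward point; the multi-objective question is which points (and trade-offs between them) are achievable across all strategies.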