2020
DOI: 10.1007/978-3-030-53291-8_26

Optimistic Value Iteration

Abstract: Markov decision processes are widely used for planning and verification in settings that combine controllable or adversarial choices with probabilistic behaviour. The standard analysis algorithm, value iteration, only provides lower bounds on infinite-horizon probabilities and rewards. Two "sound" variations, which also deliver an upper bound, have recently appeared. In this paper, we present a new sound approach that leverages value iteration's ability to usually deliver good lower bounds: we obtain a lower b…
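The abstract's core idea — iterate from below as usual, then optimistically guess an upper bound and verify it — can be sketched as follows. This is a minimal illustration on a hypothetical three-state Markov chain, not the paper's full algorithm (which handles MDPs, rewards, and retries failed guesses):

```python
import numpy as np

# Hypothetical three-state Markov chain: state 0 (initial), 1 (goal), 2 (sink).
# P[s, t] is the probability of moving from s to t; we want P(reach goal).
P = np.array([[0.5, 0.3, 0.2],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
GOAL, SINK = 1, 2

def bellman(v):
    """One application of the Bellman operator for reachability."""
    new = P @ v
    new[GOAL], new[SINK] = 1.0, 0.0  # goal/sink values are fixed
    return new

def optimistic_vi(eps=1e-6):
    """Sketch of the optimistic scheme: classic value iteration yields a
    good lower bound; we then guess lo + eps as an upper bound and verify
    the guess inductively (the Bellman operator maps it to something no
    larger, so it stays above the true fixed point)."""
    lo = np.zeros(3)
    lo[GOAL] = 1.0
    while True:
        new = bellman(lo)
        done = np.max(np.abs(new - lo)) < eps / 10
        lo = new
        if done:
            break
    hi = np.minimum(lo + eps, 1.0)  # optimistic guess
    if not np.all(bellman(hi) <= hi + 1e-12):
        raise RuntimeError("guess not inductive; full OVI would refine and retry")
    return lo, hi

lo, hi = optimistic_vi()  # true value from state 0 is 0.6
```

The retry logic of the actual algorithm is omitted here: on this contracting example the first guess is always verified.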

Cited by 42 publications (54 citation statements)
References 30 publications
“…One way to combat these problems is to approach the solution from both directions, a technique referred to as interval iteration [15,23,58]. Storm implements the latter and additionally the more recent sound value iteration [110] and optimistic value iteration [71]. Numerical errors aside, these methods ensure a correct result within a user-defined accuracy and come with a small time penalty as shown in Sect.…”
Section: Exact and Sound Model Checkingmentioning
confidence: 99%
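Interval iteration, as cited in this statement, approaches the fixed point from both directions at once. A minimal sketch on a hypothetical three-state Markov chain (assuming the usual qualitative preprocessing has already pinned down the goal and sink states, which is what makes the upper sequence converge):

```python
import numpy as np

# Hypothetical three-state Markov chain; compute P(reach goal) from state 0.
P = np.array([[0.5, 0.3, 0.2],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
GOAL, SINK = 1, 2

def step(v):
    """One Bellman step for reachability probabilities."""
    new = P @ v
    new[GOAL], new[SINK] = 1.0, 0.0
    return new

def interval_iteration(eps=1e-6):
    """Iterate a lower bound (from 0) and an upper bound (from 1)
    until they are eps-close; the true value is bracketed throughout."""
    lo = np.zeros(3); lo[GOAL] = 1.0
    hi = np.ones(3);  hi[SINK] = 0.0
    while np.max(hi - lo) > eps:
        lo, hi = step(lo), step(hi)
    return lo, hi

lo, hi = interval_iteration()  # true value from state 0 is 0.6
```

On general MDPs, identifying end components beforehand is essential; without that preprocessing the upper sequence need not converge, which is exactly the problem interval iteration was introduced to fix.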
“…The maximal total reward in M can be computed using standard techniques such as value iteration and policy iteration [46] as well as the more recent sound value iteration and optimistic value iteration [48,36]. The latter two provide sound precision guarantees for the output value…”
Section: Pure Long-run Average Queriesmentioning
confidence: 99%
“…The supremum sup {Ex^σ(lra(R_w)) | σ ∈ Σ} is attained by some memoryless deterministic strategy σ_w ∈ Σ^md [30]. Such a strategy and the induced value v_w = Ex^{σ_w}(lra(R_w)) can be computed (or approximated) with linear programming [30], strategy iteration [42] or value iteration [17,1]. The maximal total reward in M can be computed using standard techniques such as value iteration and policy iteration [46] as well as the more recent sound value iteration and optimistic value iteration [48,36]. The latter two provide sound precision guarantees for the output value v, i.e., |v − max{Ex^σ_{M,s_I}(tot(R^*)) | σ ∈ Σ_M}| ≤ ε for a given ε > 0.…”
Section: Pure Long-run Average Queriesmentioning
confidence: 99%
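The total-reward value iteration referenced in these statements can be sketched as follows. The two-state MDP and its rewards are invented for illustration, and the sketch produces only the standard lower iterates — it does not provide the soundness certificate |v − v*| ≤ ε that sound or optimistic value iteration add on top:

```python
import numpy as np

# Hypothetical MDP with states s0, s1 and a terminal state T (index 2);
# maximise expected total reward. Each action is a pair
# (distribution over [s0, s1, T], immediate reward).
ACTIONS = {
    0: [(np.array([0.0, 0.9, 0.1]), 1.0),   # from s0: risky step, reward 1
        (np.array([0.0, 0.0, 1.0]), 0.5)],  # from s0: stop now, reward 0.5
    1: [(np.array([0.0, 0.0, 1.0]), 2.0)],  # from s1: reward 2, then stop
}

def total_reward_vi(eps=1e-8, max_iter=10_000):
    """Standard value iteration for maximal expected total reward:
    apply the Bellman optimality operator starting from the zero vector
    and stop when the iterates are (numerically) stable."""
    v = np.zeros(3)  # index 2 is the terminal state, value 0 forever
    for _ in range(max_iter):
        new = v.copy()
        for s, acts in ACTIONS.items():
            new[s] = max(r + p @ v for p, r in acts)
        if np.max(np.abs(new - v)) < eps:
            return new
        v = new
    return v

v = total_reward_vi()  # optimal: v(s0) = 1 + 0.9*2 = 2.8, v(s1) = 2
```

The small-residual stopping rule is exactly what the quoted statements warn about: it bounds the change per step, not the distance to the true value, which is why the sound variants are needed for guaranteed accuracy.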