Formal methods in robotic motion planning have recently emerged as an active research topic due to their correct-by-design nature, and most results have been based on non-probabilistic discrete models. To better handle environment uncertainties, sensor noise, and actuator imperfections, control problems for probabilistic systems such as Markov chains (MCs) and Markov decision processes (MDPs) have also been studied. Most existing methods are based either on probabilistic model checking or on reinforcement-learning-oriented optimization. On the other hand, in the literature on supervisory control of discrete event systems, supervisors are usually designed to be maximally permissive. In other words, a collection of schedulers satisfying the given specification, rather than a single scheduler, is synthesized at once. We are therefore motivated to propose a novel learning-based automated supervisor synthesis framework that automatically generates a permissive supervisor such that the supervised system satisfies the given specification. Our approach is based on a modified L* learning algorithm and runs iteratively. It is guaranteed to be correct and to terminate in finitely many steps.