2014
DOI: 10.1007/978-3-319-11936-6_8

Verification of Markov Decision Processes Using Learning Algorithms

Abstract: We present a general framework for applying machine-learning algorithms to the verification of Markov decision processes (MDPs). The primary goal of these techniques is to improve performance by avoiding an exhaustive exploration of the state space. Our framework focuses on probabilistic reachability, which is a core property for verification, and is illustrated through two distinct instantiations. The first assumes that full knowledge of the MDP is available, and performs a heuristic-driven partial e…
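The first instantiation sketched in the abstract (heuristic-driven partial exploration with full knowledge of the MDP) can be pictured as bounded value iteration guided by simulation: maintain lower and upper bounds on the maximum reachability probability, sample paths greedily with respect to the upper bound, and back-propagate Bellman updates along each path. Below is a minimal sketch of that idea; the toy MDP, its states, actions, and probabilities are illustrative assumptions, not taken from the paper:

```python
import random

# Hypothetical toy MDP: states 0..3, goal state 3, sink state 2.
# transitions[s][a] = list of (successor, probability).
transitions = {
    0: {"a": [(1, 0.5), (2, 0.5)], "b": [(1, 0.8), (2, 0.2)]},
    1: {"a": [(3, 0.7), (0, 0.3)]},
}
GOAL, SINK = 3, 2

def brtdp_reach(transitions, init=0, eps=1e-4, episodes=10000):
    """Sketch of bounded, simulation-guided value iteration for maximum
    reachability: keep lower/upper bounds per state, sample paths toward
    the largest upper bound, and update bounds along each sampled path."""
    lo = {GOAL: 1.0, SINK: 0.0}
    hi = {GOAL: 1.0, SINK: 0.0}
    L = lambda s: lo.get(s, 0.0)   # unknown states: lower bound 0
    U = lambda s: hi.get(s, 1.0)   # unknown states: upper bound 1
    for _ in range(episodes):
        path, s = [], init
        while s not in (GOAL, SINK):
            path.append(s)
            # greedy action w.r.t. the upper bound (optimistic heuristic)
            act = max(transitions[s],
                      key=lambda a: sum(p * U(t) for t, p in transitions[s][a]))
            s = random.choices([t for t, _ in transitions[s][act]],
                               [p for _, p in transitions[s][act]])[0]
            if len(path) > 100:          # crude cycle guard for the sketch
                break
        for s in reversed(path):         # back-propagate Bellman updates
            lo[s] = max(sum(p * L(t) for t, p in a) for a in transitions[s].values())
            hi[s] = max(sum(p * U(t) for t, p in a) for a in transitions[s].values())
        if U(init) - L(init) < eps:      # bounds have converged at the start state
            break
    return L(init), U(init)

random.seed(1)
low, up = brtdp_reach(transitions)   # both bounds close to 14/19 for this toy MDP
```

On this example the optimal choice at state 0 is action "b", giving a maximum reachability probability of 14/19, and the algorithm closes the gap between the bounds without necessarily touching unreachable parts of a larger model.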


Cited by 152 publications (247 citation statements)
References 39 publications
“…Moreover, we gave results on the convergence speed, as well as criteria for obtaining exact convergence. As future work, it seems particularly interesting to test this algorithm on real instances, as is done in [2], where the authors moreover apply machine-learning techniques.…”
Section: Results (mentioning)
confidence: 99%
“…Interestingly, our approach was developed in parallel with Brázdil et al. [2], which solves a different problem with similar ideas. There, the authors use a machine-learning algorithm, namely real-time dynamic programming, to avoid applying the full operator at each step of value iteration, instead applying it partially based on a statistical test.…”
Section: Introduction (mentioning)
confidence: 99%
“…Figure 7 illustrates the transition probability map at a velocity of 25 km/h. In this study, according to the Markov decision processes (MDPs) introduced in [26], the driving schedule was considered a finite MDP. The MDP comprises a set of states S = {(SOC(t), n_eng(t)) | 0.2 ≤ SOC(t) ≤ 0.8, n_eng,min ≤ n_eng(t) ≤ n_eng,max}, a set of actions a = {u_th(t)}, a reward function r = f_m(s, a), and a transition function p_{s,a,s'}, where p_{s,a,s'} represents the probability of making a transition from state s to state s' using action a.…”
Section: Statistic Information of the Driving Schedule (mentioning)
confidence: 99%
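The tuple described in this excerpt (states, actions, reward function, and transition probabilities) is the standard finite-MDP interface. A minimal sketch of such a container, with a well-formedness check and one-step sampling; the concrete states, actions, rewards, and probabilities below are illustrative, not from the cited work:

```python
import random

# P[(s, a)] = {s_next: probability}, i.e. the transition function p_{s,a,s'}.
P = {
    ("low", "charge"): {"high": 0.9, "low": 0.1},
    ("low", "drive"):  {"low": 0.8, "high": 0.2},
    ("high", "drive"): {"high": 0.6, "low": 0.4},
}
# R[(s, a)] = immediate reward r(s, a).
R = {("low", "charge"): -1.0, ("low", "drive"): 0.5, ("high", "drive"): 1.0}

def check_stochastic(P, tol=1e-9):
    """Each (state, action) row of p_{s,a,s'} must sum to 1."""
    return all(abs(sum(row.values()) - 1.0) <= tol for row in P.values())

def step(s, a, rng=random):
    """Sample a successor s' according to p_{s,a,s'} and return (s', r(s, a))."""
    succ = P[(s, a)]
    s2 = rng.choices(list(succ), list(succ.values()))[0]
    return s2, R[(s, a)]
```

Checking row-stochasticity up front catches the most common modelling error (probabilities that do not sum to one) before any analysis or simulation is run.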
“…For the former, some SMC-like approaches have recently been developed. They work either by iteratively optimising the decisions of an explicitly stored scheduler [4,9], or by sampling from the scheduler space and iteratively improving a set of candidate near-optimal schedulers [5]. The former are heavyweight techniques, because the description of a (memoryless) scheduler is significant and, in the worst case, as large as the state space.…”
Section: Introduction (mentioning)
confidence: 99%