2011
DOI: 10.3182/20110828-6-it-1002.03729
Aggregation Methods for Linearly-solvable Markov Decision Process

Cited by 10 publications (12 citation statements) · References 2 publications
“…Linearly-solvable optimal control is a rich mathematical framework that has recently received a lot of attention, following Kappen's work on control-affine diffusions in continuous time [14], and our work on Markov decision processes in discrete time [27]. Both groups have since then obtained many additional results: see [36,17,6,5] and [28,31,30,29,8,32,33,9,38,39] respectively. Other groups have also started to use and further develop this framework [35,7,24,25].…”
Section: Historical Perspective
confidence: 99%
“…Recently, a new framework of linearly solvable Markov decision process (LMDP) has been proposed, in which the nonlinear Hamilton-Jacobi-Bellman (HJB) equation for continuous systems, or the Bellman equation for discrete systems, is converted into a linear equation under certain assumptions on the action cost and on the effect of the action on the state dynamics [6], [7]. In this approach, an exponentially transformed state value function, called the desirability function, is derived from the linearized Bellman equation by solving an eigenvalue problem [8] or an eigenfunction problem [9], [10]. One of the benefits is its compositionality.…”
Section: Introduction
confidence: 99%
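The eigenvalue formulation mentioned in the statement above can be illustrated with a minimal sketch. Assuming an average-cost discrete LMDP with passive transition matrix `P` and state cost `q` (all numbers below are illustrative, not from the paper), the linearized Bellman equation reads `λ z = diag(exp(-q)) P z`, where `z = exp(-v)` is the desirability function and the principal eigenvalue gives the average cost:

```python
import numpy as np

# Hypothetical 5-state average-cost LMDP (values illustrative, not from the paper).
n = 5
rng = np.random.default_rng(0)
P = rng.random((n, n))
P /= P.sum(axis=1, keepdims=True)        # row-stochastic passive dynamics p(x'|x)
q = np.array([1.0, 0.5, 0.0, 0.5, 1.0])  # state cost q(x)

# Linearized Bellman equation: lam * z = diag(exp(-q)) @ P @ z, with z = exp(-v).
G = np.diag(np.exp(-q)) @ P
vals, vecs = np.linalg.eig(G)
k = np.argmax(vals.real)                 # principal (Perron) eigenpair
lam = vals[k].real
z = np.abs(vecs[:, k].real)              # desirability, positive by Perron-Frobenius
v = -np.log(z)                           # value function up to an additive constant
c_avg = -np.log(lam)                     # average cost per step

# Optimal controlled transitions: u*(x'|x) proportional to p(x'|x) z(x').
U = P * z[None, :]
U /= U.sum(axis=1, keepdims=True)
```

The key point is that the Bellman operator, nonlinear in `v`, becomes the linear map `G` acting on `z`, so standard eigensolvers (or power iteration) replace value iteration.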
“…However, additional learning is needed when a new initial state or a new goal state is given. In the value-based approach, an exponentially transformed state value function is defined as the desirability function and is derived from the linearized Bellman equation by solving an eigenvalue problem (Todorov, 2007) or an eigenfunction problem (Todorov, 2009c; Zhong and Todorov, 2011). One of the benefits of the desirability function approach is its compositionality.…”
Section: Introduction
confidence: 99%
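The compositionality mentioned in the statements above follows from the linearity of the desirability equation in the terminal condition. A minimal sketch, assuming a first-exit LMDP with hypothetical interior/terminal state blocks (all numbers illustrative): solving with a weighted combination of terminal desirabilities yields the same weighted combination of the component solutions.

```python
import numpy as np

# Hypothetical first-exit LMDP: 4 interior states, 2 terminal states
# (all values illustrative). Passive dynamics split into blocks.
rng = np.random.default_rng(1)
P = rng.random((4, 6))
P /= P.sum(axis=1, keepdims=True)
P_ii, P_it = P[:, :4], P[:, 4:]          # interior->interior, interior->terminal
q = np.full(4, 0.3)                      # interior state costs
M = np.diag(np.exp(-q)) @ P_ii

def solve(z_term):
    """Interior desirability for terminal desirability z_term:
       z = diag(exp(-q)) (P_ii z + P_it z_term), a linear system in z."""
    b = np.diag(np.exp(-q)) @ P_it @ z_term
    return np.linalg.solve(np.eye(4) - M, b)

z1 = solve(np.array([1.0, 0.0]))         # task 1: prefer exit at terminal A
z2 = solve(np.array([0.0, 1.0]))         # task 2: prefer exit at terminal B
w1, w2 = 0.7, 0.3
z_mix = solve(np.array([w1, w2]))        # composite task

# Compositionality: the composite solution is the same linear combination.
assert np.allclose(z_mix, w1 * z1 + w2 * z2)
```

This is what lets previously solved component tasks be reused for new goal configurations without re-solving the Bellman equation from scratch, which is the property the citing papers highlight.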