1994
DOI: 10.1002/9780470316887

Markov Decision Processes


Cited by 5,046 publications (1,170 citation statements)
References 2 publications (2 reference statements)
“…Under mild conditions [1], the algorithm can be shown to converge to the solution of Equation (3), thus yielding both the optimal policy π* and its associated optimal average cost.…”
Section: The Value Iteration Algorithm (mentioning)
Confidence: 99%
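For average-cost problems, the value iteration scheme this excerpt refers to is usually implemented as relative value iteration. Below is a minimal sketch assuming a small finite MDP with made-up transition matrices P and one-step costs c; the function name and the toy numbers are illustrative, not taken from the citing paper or from [1].

```python
import numpy as np

def relative_value_iteration(P, c, tol=1e-8, max_iter=100_000):
    """P: list of S x S transition matrices, one per action.
    c: S x A array of one-step costs.
    Returns (optimal average cost, relative values, greedy policy)."""
    S, A = c.shape
    h = np.zeros(S)                                    # relative value function
    for _ in range(max_iter):
        # Bellman backup: Q[s, a] = c[s, a] + sum_s' P[a][s, s'] * h[s']
        Q = np.stack([c[:, a] + P[a] @ h for a in range(A)], axis=1)
        Th = Q.min(axis=1)
        diff = Th - h
        if diff.max() - diff.min() < tol:              # span convergence test
            g = 0.5 * (diff.max() + diff.min())        # optimal average cost
            return g, h, Q.argmin(axis=1)              # greedy (optimal) policy
        h = Th - Th[0]                                 # renormalise at state 0
    raise RuntimeError("value iteration did not converge")

# Toy 2-state, 2-action example (made-up numbers):
P = [np.array([[0.9, 0.1], [0.2, 0.8]]),               # action 0
     np.array([[0.5, 0.5], [0.6, 0.4]])]               # action 1
c = np.array([[1.0, 2.0], [0.5, 3.0]])
g_star, h_star, policy = relative_value_iteration(P, c)
print(g_star, policy)
```

Under the kind of mild conditions the excerpt alludes to (for instance a unichain, aperiodic model), the span of Th − h shrinks to zero, the common offset converges to the optimal average cost, and the minimizing actions give the optimal policy π*.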
“…Under general conditions [1], each control policy π ∈ Π implies a finite long-term cost. The task of the decision maker is to identify a policy π* ∈ Π that minimizes the long-term average cost, thus satisfying the expression below:…”
Section: Average Cost Markov Decision Processes (mentioning)
Confidence: 99%
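As a numerical illustration of the quantity being minimized: for a fixed stationary policy on a finite unichain model, the long-run average cost is the one-step cost weighted by the stationary distribution of the chain the policy induces. The sketch below assumes such a model; P_pi, c_pi, and the example numbers are hypothetical.

```python
import numpy as np

def average_cost_of_policy(P_pi, c_pi):
    """Long-run average cost of a fixed stationary policy.
    P_pi: S x S transition matrix induced by the policy.
    c_pi: length-S one-step costs under the policy."""
    S = P_pi.shape[0]
    # Stationary distribution mu: mu @ P_pi = mu with mu.sum() == 1.
    A = np.vstack([P_pi.T - np.eye(S), np.ones((1, S))])
    b = np.zeros(S + 1)
    b[-1] = 1.0
    mu = np.linalg.lstsq(A, b, rcond=None)[0]
    return float(mu @ c_pi)

# Illustrative 2-state chain and costs:
P_pi = np.array([[0.9, 0.1], [0.3, 0.7]])
c_pi = np.array([1.0, 4.0])
print(average_cost_of_policy(P_pi, c_pi))   # 0.75 * 1.0 + 0.25 * 4.0 = 1.75
```

The decision maker's problem in the excerpt is then to pick the policy whose induced chain makes this quantity as small as possible.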
“…Moreover, under the axiom of non-satiation, the consumer will spend all his wealth in the last period of his life span and therefore W_{T+1} = 0. The problem (2)–(3) is a discrete-time stochastic control problem (see [10, 11]). Now consider the case when the consumer has lived to period t and his wealth is W_t.…”
Section: Utility Maximization Under Random Life Span and Uncertain In… (mentioning)
Confidence: 99%
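To make the structure of that discrete-time control problem concrete, here is a minimal backward-induction sketch with an assumed log utility, a fixed gross return R, a per-period survival probability p, and a wealth grid; all names, parameter values, and functional forms are illustrative assumptions, not the cited paper's model. The terminal condition V_{T+1} = 0 is what forces the last surviving period to consume all remaining wealth, i.e. W_{T+1} = 0.

```python
import numpy as np

def solve_consumption(T=5, R=1.03, beta=0.96, p=0.9, W_max=10.0, n_grid=200):
    """Backward induction with terminal condition V_{T+1} = 0, so the last
    period consumes all remaining wealth (W_{T+1} = 0 under non-satiation)."""
    grid = np.linspace(1e-3, W_max, n_grid)        # wealth grid
    V_next = np.zeros(n_grid)                      # V_{T+1}(W) = 0
    policies = []
    for t in range(T, 0, -1):                      # t = T, T-1, ..., 1
        V = np.empty(n_grid)
        C_opt = np.empty(n_grid)
        for i, W in enumerate(grid):
            C = np.linspace(1e-3, W, 100)          # candidate consumption levels
            cont = np.interp((W - C) * R, grid, V_next)
            total = np.log(C) + beta * p * cont    # survive to t+1 with prob. p
            j = np.argmax(total)
            V[i], C_opt[i] = total[j], C[j]
        policies.append(C_opt)
        V_next = V
    return grid, policies[::-1]                    # consumption rules for t = 1..T

grid, policies = solve_consumption()
# In the final period the rule consumes (essentially) all remaining wealth:
print(np.allclose(policies[-1], grid))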