Markov decision processes with a minimum-variance criterion (1987)
DOI: 10.1016/0022-247X(87)90332-5

Cited by 29 publications (25 citation statements) · References 7 publications
“…Research in this direction was initiated by Mandl [12], Jaquette [5], [6], [8], Benito [1], and Sobel [17]. More recent extensions of these results can be found in [11], [9], and [16]. In particular, these references consider the variance (or second moment) of the total expected discounted or average rewards of controlled, discrete-time Markov reward chains, in order to determine the 'best' policy within the class of discounted (or average) optimal policies, i.e. the one with a smaller variance (or lower second moment) of the cumulative reward.…”
Section: Motivation
confidence: 94%
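As a concrete illustration of the quantity these works minimize, the following minimal numpy sketch (our own, not taken from the cited paper; all names and numbers are made up) computes the mean and variance of the total discounted reward of a Markov reward chain under a fixed stationary policy, assuming the reward depends only on the current state. The mean solves the linear system V = (I - beta*P)^{-1} r, and the second moment solves an analogous linear system with discount beta^2, from which the variance follows.

```python
import numpy as np

def discounted_mean_and_variance(P, r, beta):
    """Mean and variance of X = sum_t beta^t r(s_t) for a Markov
    reward chain with transition matrix P, per-state rewards r, and
    discount factor 0 < beta < 1 (reward depends on the state only)."""
    n = len(r)
    I = np.eye(n)
    # First moment: V = (I - beta P)^{-1} r.
    V = np.linalg.solve(I - beta * P, r)
    # Second moment: M = r^2 + 2 beta r * (P V) + beta^2 P M,
    # another linear system, now with discount beta^2.
    M = np.linalg.solve(I - beta**2 * P, r**2 + 2 * beta * r * (P @ V))
    return V, M - V**2          # variance = E[X^2] - (E[X])^2

# Example with made-up numbers:
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
r = np.array([1.0, 0.0])
V, var = discounted_mean_and_variance(P, r, 0.95)
```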
“…Results related to those on selecting the 'best' optimal stationary policy with the smallest (minimal) variance in the discrete-time case can then be expected (see, e.g., [9], [11], and [12]). However, as this 'optimization' is notationally more complex and requires a number of technicalities and results from Markov decision theory, the details and results are left for further research.…”
Section: Remark 3.3 (Controlled Case)
confidence: 99%
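To make the selection of the 'best' optimal stationary policy concrete, here is a toy enumeration (our own construction; the MDP and all numbers are invented for illustration): among the stationary deterministic policies of a small discounted MDP, it keeps those attaining the maximal mean reward from a fixed initial state and, among them, picks the one with the smallest variance of the cumulative discounted reward.

```python
import itertools
import numpy as np

def mean_and_variance(P, r, beta):
    # Mean V and variance of the total discounted reward of a fixed
    # stationary policy (Markov reward chain with matrix P, rewards r).
    n = len(r)
    I = np.eye(n)
    V = np.linalg.solve(I - beta * P, r)
    M = np.linalg.solve(I - beta**2 * P, r**2 + 2 * beta * r * (P @ V))
    return V, M - V**2

# Toy 4-state, 2-action MDP (all numbers made up). Both actions agree
# everywhere except in state 0, where action 0 moves deterministically
# to state 2 and action 1 randomizes between states 1 and 3.
n_states, n_actions = 4, 2
P_a = np.zeros((n_actions, n_states, n_states))
P_a[0, 0, 2] = 1.0                      # action 0 in state 0
P_a[1, 0, 1] = P_a[1, 0, 3] = 0.5       # action 1 in state 0
for a in range(n_actions):              # states 1, 2, 3: go to sink 3
    P_a[a, 1, 3] = P_a[a, 2, 3] = P_a[a, 3, 3] = 1.0
r = np.array([0.0, 4.0, 2.0, 0.0])      # reward depends on the state only
beta, s0 = 0.95, 0

results = []
for policy in itertools.product(range(n_actions), repeat=n_states):
    P = np.stack([P_a[a, s] for s, a in enumerate(policy)])
    V, var = mean_and_variance(P, r, beta)
    results.append((policy, V[s0], var[s0]))

best_mean = max(m for _, m, _ in results)
optimal = [t for t in results if abs(t[1] - best_mean) < 1e-9]
policy, mean, var = min(optimal, key=lambda t: t[2])
print(policy, mean, var)
```

In this toy chain both actions in state 0 yield the same mean (beta * 2 = 1.9), but the deterministic transition has zero variance while the randomized one has variance beta^2 * 4, so the enumeration selects a policy taking action 0 in state 0.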
“…To the best of our knowledge, most of the aforementioned works on MDPs focus on solving mean-variance problems in discrete-time MDPs (DTMDPs) [3,4,6,13,14,21,24,25,28,30,32] as well as in continuous-time MDPs (CTMDPs) [8,9,10,11,12,20,27]; only a few works address mean-variance problems in semi-Markov decision processes (SMDPs); see [2,28] for finite SMDPs and [19] for a finite time horizon. Moreover, it should be noted that most of the existing works on mean-variance problems for MDPs deal with fixed finite or infinite time horizons.…”
Section: Introduction
confidence: 99%
“…Mean-variance problems arise from the tradeoff between the mean and the variance: a risk-averse investor usually prefers a return lower than the maximal one in order to keep the variance risk smaller. For this reason, mean-variance problems have been widely studied for various dynamic systems, described by stochastic differential equations [5,7,22,31], Markov decision processes (MDPs) [2,3,8,10,13,21,27,28], and so on.…”
Section: Introduction
confidence: 99%