2011
DOI: 10.1017/s002190020000855x
Sensitivity Analysis in Markov Decision Processes with Uncertain Reward Parameters

Abstract: Sequential decision problems can often be modeled as Markov decision processes. Classical solution approaches assume that the parameters of the model are known. However, model parameters are usually estimated and uncertain in practice. As a result, managers are often interested in how estimation errors affect the optimal solution. In this paper we illustrate how sensitivity analysis can be performed directly for a Markov decision process with uncertain reward parameters using the Bellman equations. In particular…
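One concrete way to see what such a sensitivity analysis can look like is the policy-evaluation form of the Bellman equations: for a fixed policy the value function is linear in the reward vector, so its Jacobian with respect to the rewards is available in closed form. The sketch below illustrates only this general idea, not the paper's actual procedure; the three-state chain, discount factor, and reward estimates are invented for illustration.

```python
import numpy as np

# Minimal sketch of reward sensitivity via the policy-evaluation (Bellman)
# equation.  For a fixed policy pi,
#     v = r_pi + beta * P_pi @ v   =>   v = inv(I - beta * P_pi) @ r_pi,
# so dv/dr_pi equals inv(I - beta * P_pi): entry (s, s') says how much an
# estimation error in the reward of state s' shifts the value of state s.
# The chain, discount factor, and rewards below are assumed, not from the paper.

beta = 0.9                                    # discount factor (assumed)
P_pi = np.array([[0.7, 0.2, 0.1],             # transitions under the fixed policy
                 [0.1, 0.8, 0.1],
                 [0.3, 0.3, 0.4]])
r_pi = np.array([1.0, 0.5, -0.2])             # estimated per-state rewards

sensitivity = np.linalg.inv(np.eye(3) - beta * P_pi)   # Jacobian dv/dr_pi
v = sensitivity @ r_pi                                   # value function

print("value function:", v)
print("dv[0]/dr:", sensitivity[0])            # sensitivity of state 0's value
```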

Cited by 4 publications (4 citation statements)
References 20 publications (11 reference statements)
“…To this end, the weight α in the cost functional L is used as tuning parameter for the online implementation. The approach suits this problem due to the strong continuity in the cost function, which implies that the value function is continuous and therefore the approach is robust against disturbances, such as variations in the A/C settings [31]. The tradeoff weight in the MPC is defined as α′ to distinguish between the value used in the DP:…”
Section: Results (mentioning)
confidence: 99%
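The continuity argument in the excerpt above can be illustrated with a small dynamic program: if the stage cost is a weighted combination L = α·c_energy + (1 − α)·c_comfort, the backward-DP value function changes only slightly when α is perturbed slightly. Everything in the sketch below (dynamics, cost terms, horizon) is a made-up stand-in for the cited application, purely to show the qualitative behaviour.

```python
import numpy as np

# Sketch of the continuity argument: the finite-horizon DP value function
# varies continuously with the tradeoff weight alpha in the stage cost
# L = alpha * c_energy + (1 - alpha) * c_comfort.  All data below is invented.

np.random.seed(0)
n_states, n_actions, horizon = 5, 3, 20
P = np.random.dirichlet(np.ones(n_states), size=(n_actions, n_states))  # P[a, s, :]
c_energy = np.random.rand(n_states, n_actions)
c_comfort = np.random.rand(n_states, n_actions)

def dp_value(alpha):
    """Backward DP for the weighted stage cost; returns the initial value function."""
    L = alpha * c_energy + (1.0 - alpha) * c_comfort
    v = np.zeros(n_states)
    for _ in range(horizon):
        q = L + np.einsum('ast,t->sa', P, v)   # expected cost-to-go per (state, action)
        v = q.min(axis=1)                      # cost minimisation
    return v

for a1, a2 in [(0.50, 0.51), (0.50, 0.60)]:
    gap = np.max(np.abs(dp_value(a1) - dp_value(a2)))
    print(f"alpha {a1:.2f} -> {a2:.2f}: max value change {gap:.4f}")
```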
“…It is important to consider both of these types of changes because it may be the case that changes in a specific parameter will affect the optimal value function significantly, whereas the optimal policy may not be sensitive to these changes. Tan and Hartman [73] explore the inverse problem of determining the degree to which reward parameters can vary without the optimal policy changing.…”
Section: Quantifying Model Uncertainty (mentioning)
confidence: 99%
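A brute-force version of that inverse question is easy to state: perturb one reward estimate over a grid, re-solve the MDP, and record the range over which the greedy policy stays the same. The sketch below does exactly that for an invented two-state, two-action MDP; it is not Tan and Hartman's method, which works directly with the Bellman equations rather than by repeated re-solving.

```python
import numpy as np

# Illustrative brute-force check: how far can a single reward entry move before
# the optimal policy changes?  The MDP, discount factor, and grid are assumed.

beta = 0.9
# P[a, s, s'] and R[s, a] for a made-up 2-state, 2-action MDP
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.6, 0.4]]])
R = np.array([[1.0, 0.3],
              [0.0, 0.8]])

def optimal_policy(R):
    """Value iteration on the assumed MDP; returns the greedy policy for R."""
    v = np.zeros(2)
    for _ in range(2000):
        q = R + beta * np.einsum('ast,t->sa', P, v)
        v_new = q.max(axis=1)
        if np.max(np.abs(v_new - v)) < 1e-10:
            break
        v = v_new
    return tuple(q.argmax(axis=1))

def policy_after_perturbing(delta):
    Rp = R.copy()
    Rp[0, 0] += delta                      # perturb a single reward estimate
    return optimal_policy(Rp)

base = optimal_policy(R)
deltas = np.linspace(-2.0, 2.0, 401)       # grid with step 0.01
stable = [d for d in deltas if policy_after_perturbing(d) == base]
print("nominal optimal policy:", base)
print(f"policy unchanged for R[0,0] shifts in about "
      f"[{min(stable):.2f}, {max(stable):.2f}]")
```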
“…where we utilize the fact that $\sum_{i\in S, a\in A} \beta y^2 x(i,a) = \beta y^2$, the S-by-SA matrix A and the S-dimensional column vector b are determined by the constraint equations in (11), $c = \beta r^2_{\odot} - r$, $c' = -2\beta r$, and r and x are SA-dimensional column vectors with elements r(i, a) and x(i, a), respectively. We observe that the right-hand side of (12) is a parametric linear program (PLP) (Gal and Greenberg, 1997; Tan and Hartman, 2011) with a linear parameter y. Below we do sensitivity analysis for this PLP problem.…”
Section: Lemma 2 (Critical Points) There Exists a Series of Intervals (mentioning)
confidence: 99%
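The excerpt above reduces the sensitivity question to a parametric linear program in a scalar parameter y. A crude way to locate the "critical points" of such a PLP, the parameter values at which the optimal basis changes, is to sweep y and re-solve. The sketch below does this with randomly generated stand-in data; A, b, c, and c′ here are not the quantities from the cited derivation.

```python
import numpy as np
from scipy.optimize import linprog

# Brute-force look at parametric-LP sensitivity: minimise (c + y * c_prime) @ x
# subject to A x = b, x >= 0, sweeping the scalar parameter y and flagging the
# values where the optimal support (basis) changes.  All data here is random.

rng = np.random.default_rng(1)
m, n = 3, 6
A = rng.normal(size=(m, n))
x_feas = rng.uniform(0.5, 1.5, size=n)       # strictly positive feasible point
b = A @ x_feas                               # guarantees feasibility
c, c_prime = rng.normal(size=n), rng.normal(size=n)

def support(y, tol=1e-8):
    """Solve the LP at parameter y and return the support of the optimal x."""
    res = linprog(c + y * c_prime, A_eq=A, b_eq=b,
                  bounds=[(0, None)] * n, method="highs")
    return frozenset(np.flatnonzero(res.x > tol)) if res.success else None

prev, critical = None, []
for y in np.linspace(-2.0, 2.0, 201):
    s = support(y)
    if prev is not None and s is not None and s != prev:
        critical.append(round(y, 2))          # basis changed near this y
    prev = s
print("approximate critical points of the PLP:", critical)
```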