2015
DOI: 10.1109/tac.2014.2343831
Asymptotic Optimality and Rates of Convergence of Quantized Stationary Policies in Stochastic Control

Abstract: We consider the discrete approximation of stationary policies for a discrete-time Markov decision process with Polish state and action spaces under total, discounted, and average cost criteria. Deterministic stationary quantizer policies are introduced and shown to approximate optimal deterministic stationary policies with arbitrary precision under mild technical conditions, demonstrating that one can search for ε-optimal policies within the class of quantized control policies. We also…
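The idea behind quantized policies can be illustrated with a minimal sketch. Everything below is illustrative and not taken from the paper: a toy deterministic MDP on the state space [0, 1] with action space [0, 1], one-stage cost c(x, a) = x² + a², and next state x' = clip(x + a − 0.5, 0, 1). Both spaces are quantized on uniform finite grids and value iteration is run on the resulting finite model, yielding a deterministic stationary quantizer policy of the kind the abstract describes.

```python
import numpy as np

def quantized_value_iteration(n_states=51, n_actions=11, beta=0.9, iters=400):
    """Value iteration on a quantized toy MDP (illustrative dynamics only)."""
    xs = np.linspace(0.0, 1.0, n_states)       # quantized state grid
    acts = np.linspace(0.0, 1.0, n_actions)    # quantized action grid
    # Deterministic next state for every (state, action) pair, snapped
    # back onto the state grid by a nearest-neighbor quantizer.
    nxt = np.clip(xs[:, None] + acts[None, :] - 0.5, 0.0, 1.0)
    idx = np.abs(nxt[:, :, None] - xs[None, None, :]).argmin(axis=2)
    cost = xs[:, None] ** 2 + acts[None, :] ** 2
    V = np.zeros(n_states)
    for _ in range(iters):
        Q = cost + beta * V[idx]               # Bellman operator on the grid
        V = Q.min(axis=1)
    policy = acts[Q.argmin(axis=1)]            # greedy deterministic quantizer policy
    return xs, V, policy

xs, V, policy = quantized_value_iteration()
```

As the grids are refined (larger n_states, n_actions), the value of the quantizer policy approaches that of the original model; the paper's contribution is to establish when such convergence holds, and at what rate, for general Polish spaces.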

Cited by 22 publications (21 citation statements)
References 20 publications
“…However, it still requires the Lipschitz continuity of some components of dynamic programs. Unlike the aforementioned approaches, the finite-state and finite-action approximation method for MDPs with σ-compact state spaces proposed by Saldi et al does not rely on Lipschitz-type continuity conditions [15,16].…”
Section: Related Work
confidence: 99%
“…where ϕ_n is the optimal policy for (CP^n_{k−ε_k 1}) obtained by extending the optimal policy ϕ_n of (CP^n_{k−ε_k 1}) to X, i.e., ϕ_n(· | x) = ϕ(· | Q_n(x)). Here, (13) follows from Theorem 1; (14) and (15) follow from Proposition 2. We observe that (15) implies that ϕ_n is feasible for (CP_k), and furthermore, by (12), (13), and (14), the true cost of ϕ_n is within κ of the optimal value of (CP_k), i.e., J(ϕ_n, γ) − min(CP_k) < κ.…”
Section: B. Asymptotic Approximation of Optimal Policy
confidence: 99%
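The extension step quoted above, ϕ_n(· | x) = ϕ(· | Q_n(x)), composes a policy defined only on a finite grid with a quantizer Q_n mapping every state to its grid representative. A minimal sketch, assuming a nearest-neighbor quantizer on X = [0, 1] and a toy grid policy (both choices are illustrative, not from the cited proof):

```python
import numpy as np

def make_quantizer(n):
    """Nearest-neighbor quantizer Q_n onto an n-point uniform grid in [0, 1]."""
    grid = np.linspace(0.0, 1.0, n)
    def Q_n(x):
        return grid[np.abs(grid - x).argmin()]   # closest grid point to x
    return grid, Q_n

def extend_policy(phi_on_grid, Q_n):
    """Extend a grid-defined policy to all of X via phi_n(x) = phi(Q_n(x))."""
    return lambda x: phi_on_grid[Q_n(x)]

grid, Q_n = make_quantizer(5)                    # grid: 0, 0.25, 0.5, 0.75, 1
phi_on_grid = {g: g / 2 for g in grid}           # toy policy defined on the grid
phi_n = extend_policy(phi_on_grid, Q_n)          # policy defined on all of [0, 1]
```

The extended policy is constant on each quantization cell, which is what makes its cost comparable to that of the finite-model policy it came from.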
“…Bounds on the error or convergence rates for the approximating control model can be derived depending on the regularity hypotheses made on the parameters of the model. There exists a huge literature related to that approach: see, among others, [1,2,7,8,12,14,16,24,27] and the references therein. To some extent, our approach here (for CTMDPs) is related to the references [1,Chapter 17] and [7] on DTMDPs.…”
Section: Motivation and Contribution
confidence: 99%