2015
DOI: 10.1109/tac.2014.2343831
Asymptotic Optimality and Rates of Convergence of Quantized Stationary Policies in Stochastic Control

Abstract: We consider the discrete approximation of stationary policies for a discrete-time Markov decision process with Polish state and action spaces under total, discounted, and average cost criteria. Deterministic stationary quantizer policies are introduced and shown to approximate optimal deterministic stationary policies with arbitrary precision under mild technical conditions, demonstrating that one can search for ε-optimal policies within the class of quantized control policies. We also…
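The idea behind quantized policies can be illustrated with a minimal sketch. Everything below is illustrative and not taken from the paper: a toy deterministic MDP on the state space [0, 1] with action space [0, 1], one-stage cost c(x, a) = x² + a², and next state x' = clip(x + a − 0.5, 0, 1). Both spaces are quantized on uniform finite grids and value iteration is run on the resulting finite model, yielding a deterministic stationary quantizer policy of the kind the abstract describes.

```python
import numpy as np

def quantized_value_iteration(n_states=51, n_actions=11, beta=0.9, iters=400):
    """Value iteration on a quantized toy MDP (illustrative dynamics only)."""
    xs = np.linspace(0.0, 1.0, n_states)       # quantized state grid
    acts = np.linspace(0.0, 1.0, n_actions)    # quantized action grid
    # Deterministic next state for every (state, action) pair, snapped
    # back onto the state grid by a nearest-neighbor quantizer.
    nxt = np.clip(xs[:, None] + acts[None, :] - 0.5, 0.0, 1.0)
    idx = np.abs(nxt[:, :, None] - xs[None, None, :]).argmin(axis=2)
    cost = xs[:, None] ** 2 + acts[None, :] ** 2
    V = np.zeros(n_states)
    for _ in range(iters):
        Q = cost + beta * V[idx]               # Bellman operator on the grid
        V = Q.min(axis=1)
    policy = acts[Q.argmin(axis=1)]            # greedy deterministic quantizer policy
    return xs, V, policy

xs, V, policy = quantized_value_iteration()
```

As the grids are refined (larger n_states, n_actions), the value of the quantizer policy approaches that of the original model; the paper's contribution is to establish when such convergence holds, and at what rate, for general Polish spaces.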

Cited by 22 publications (21 citation statements)
References 20 publications
“…However, it still requires the Lipschitz continuity of some components of dynamic programs. Unlike the aforementioned approaches, the finite-state and finite-action approximation method for MDPs with σ-compact state spaces proposed by Saldi et al does not rely on Lipschitz-type continuity conditions [15,16].…”
Section: Related Work
confidence: 99%
“…where ϕ_n is the optimal policy for (CP^n_{k−ε_k 1}) obtained by extending the optimal policy ϕ_n of (CP^n_{k−ε_k 1}) to X, i.e., ϕ_n(· | x) = ϕ(· | Q_n(x)). Here, (13) follows from Theorem 1; (14) and (15) follow from Proposition 2. We observe that (15) implies that ϕ_n is feasible for (CP_k), and furthermore, by (12), (13), and (14), the true cost of ϕ_n is within κ of the optimal value of (CP_k), i.e., J(ϕ_n, γ) − min(CP_k) < κ.…”
Section: B. Asymptotic Approximation of Optimal Policy
confidence: 99%
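The extension step quoted above, ϕ_n(· | x) = ϕ(· | Q_n(x)), composes a policy defined only on a finite grid with a quantizer Q_n mapping every state to its grid representative. A minimal sketch, assuming a nearest-neighbor quantizer on X = [0, 1] and a toy grid policy (both choices are illustrative, not from the cited proof):

```python
import numpy as np

def make_quantizer(n):
    """Nearest-neighbor quantizer Q_n onto an n-point uniform grid in [0, 1]."""
    grid = np.linspace(0.0, 1.0, n)
    def Q_n(x):
        return grid[np.abs(grid - x).argmin()]   # closest grid point to x
    return grid, Q_n

def extend_policy(phi_on_grid, Q_n):
    """Extend a grid-defined policy to all of X via phi_n(x) = phi(Q_n(x))."""
    return lambda x: phi_on_grid[Q_n(x)]

grid, Q_n = make_quantizer(5)                    # grid: 0, 0.25, 0.5, 0.75, 1
phi_on_grid = {g: g / 2 for g in grid}           # toy policy defined on the grid
phi_n = extend_policy(phi_on_grid, Q_n)          # policy defined on all of [0, 1]
```

The extended policy is constant on each quantization cell, which is what makes its cost comparable to that of the finite-model policy it came from.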
“…Bounds on the error or convergence rates for the approximating control model can be derived depending on the regularity hypotheses made on the parameters of the model. There exists a huge literature related to that approach: see, among others, [1,2,7,8,12,14,16,24,27] and the references therein. To some extent, our approach here (for CTMDPs) is related to the references [1,Chapter 17] and [7] on DTMDPs.…”
Section: Motivation and Contribution
confidence: 99%