2021
DOI: 10.1007/s10589-021-00278-3

Policy iteration for Hamilton–Jacobi–Bellman equations with control constraints

Abstract: Policy iteration is a widely used technique to solve the Hamilton–Jacobi–Bellman (HJB) equation, which arises from nonlinear optimal feedback control theory. Its convergence analysis has attracted much attention in the unconstrained case. Here we analyze the case with control constraints, both for the HJB equations that arise in deterministic and in stochastic control. The linear equations in each iteration step are solved by an implicit upwind scheme. Numerical examples are conducted to solve the HJB eq…
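The abstract describes the two building blocks of each policy-iteration sweep: a linear equation solved with an implicit upwind scheme (policy evaluation) and a constrained pointwise minimization (policy improvement). Below is a minimal one-dimensional sketch of that loop; the dynamics f, the quadratic running cost, the discount rate, the grid and the box constraints are illustrative assumptions, not the paper's actual test problems.

```python
# A minimal sketch of policy iteration for a 1-D control-constrained HJB
# equation lam*v(x) = min_u { ell(x,u) + (f(x)+u) v'(x) }, u in [u_min, u_max].
# All problem data below are assumptions for illustration only.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

lam, u_min, u_max, gamma = 0.5, -1.0, 1.0, 0.1   # discount, control box, control weight

N = 401
x = np.linspace(-2.0, 2.0, N)
h = x[1] - x[0]

f = lambda s: s - s**3                 # assumed uncontrolled drift (points inward at the boundary)
ell = lambda s, u: s**2 + gamma * u**2  # assumed running cost

def upwind_operator(b):
    """Implicit upwind discretization of b(x) * d/dx: forward differences where
    the drift b is positive, backward where it is negative, keeping the scheme
    monotone. With this inward-pointing drift no ghost points are needed."""
    bp, bm = np.maximum(b, 0.0), np.minimum(b, 0.0)
    main = (-bp + bm) / h
    upper = bp[:-1] / h                # coefficient of v[i+1]
    lower = -bm[1:] / h                # coefficient of v[i-1]
    return sp.diags([lower, main, upper], offsets=[-1, 0, 1], format="csr")

u = np.zeros(N)                        # initial admissible policy
for it in range(50):
    b = f(x) + u                       # controlled drift under the current policy
    # policy evaluation: solve the *linear* equation lam*v - b*v' = ell(x, u)
    A = sp.eye(N) * lam - upwind_operator(b)
    v = spla.spsolve(A.tocsc(), ell(x, u))
    # policy improvement: pointwise minimizer of u*v' + gamma*u^2,
    # projected (clipped) onto the control box
    dv = np.gradient(v, h)
    u_new = np.clip(-dv / (2.0 * gamma), u_min, u_max)
    if np.max(np.abs(u_new - u)) < 1e-8:
        break
    u = u_new
print("stopped after", it + 1, "iterations")
```

Clipping the unconstrained minimizer onto the control box is one simple way to realize the constraint, in the spirit of the projection operators mentioned in the citing works quoted below.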

Cited by 5 publications (3 citation statements) · References 37 publications (50 reference statements)
“…Reducing the size of the constraint box, the difference between the no-gradient regression and choosing the best λ increases, confirming that the gradient cross achieves a better result for the constrained case in the presence of information on both the value function and its gradient. The same example has been considered in [32]. We fix σ = 10, β = 8/3, ρ = 2 and (x(0), y(0), z(0)) = (−1, −1, −1).…”
Section: Optimal Control
confidence: 99%
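The snippet above refers to a controlled Lorenz-type example with σ = 10, β = 8/3, ρ = 2 and initial state (−1, −1, −1). A minimal sketch of that dynamical setup is given below; how the control enters the dynamics, the integrator and the time horizon are assumptions for illustration, not taken from the citing paper.

```python
# A minimal sketch of the (controlled) Lorenz dynamics with the parameters
# quoted above. The additive control in the first component, the horizon T
# and the explicit Euler integrator are illustrative assumptions only.
import numpy as np

sigma, beta, rho = 10.0, 8.0 / 3.0, 2.0
x0 = np.array([-1.0, -1.0, -1.0])

def lorenz_rhs(state, u=0.0):
    """Right-hand side of the Lorenz system with an assumed additive control u."""
    x, y, z = state
    return np.array([sigma * (y - x) + u,
                     rho * x - y - x * z,
                     x * y - beta * z])

# uncontrolled forward simulation with explicit Euler (illustration only)
dt, T = 1e-3, 5.0
state = x0.copy()
for _ in range(int(T / dt)):
    state = state + dt * lorenz_rhs(state)
print("state at T =", T, ":", state)
```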
“…Analogously to [41] one can also incorporate control constraints in terms of projection operators. The generalization of the present approach to stochastic control problems and finite horizon problems is discussed in [21,53].…”
Section: Introduction
confidence: 99%
“…8); see Fig. 3.8, which illustrates that PI can be viewed as Newton's method for solving the Bellman equation in the function space of cost functions J. The interpretation of PI as a form of Newton's method has a long history, for which we refer to the original papers by Kleinman [Klei68] for linear quadratic problems, and by Pollatschek and Avi-Itzhak [PoA69] for the finite-state discounted and Markov game cases. Subsequent works, which address broader classes of problems and algorithmic variations, include (among others) Hewer [Hew71], Puterman and Brumelle [PuB78], [PuB79], Saridis and Lee [SaL79] (following Rekasius [Rek64]), Beard [Bea95], Beard, Saridis, and Wen [BSW99], Santos and Rust [SaR04], Bokanowski, Maroso, and Zidani [BMZ09], Hylla [Hyl11], Magirou, Vassalos, and Barakitis [MVB20], Bertsekas [Ber21c], and Kundu and Kunisch [KuK21]. Some of these papers include superlinear convergence rate results. Rollout: Generally, rollout with base policy µ can be viewed as a single iteration of Newton's method starting from J_µ, as applied to the solution of the Bellman equation (see Fig. 3.8).…”
confidence: 99%
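The Newton's-method view quoted above is easy to verify numerically on a small finite-state discounted MDP: one policy-iteration step started at J_µ coincides with one Newton step for the Bellman residual F(J) = T(J) − J, where T is the Bellman optimality operator. The sketch below checks this identity on assumed random data (a 3-state, 2-action MDP); only the identity itself comes from the text above, everything else is an illustrative assumption.

```python
# A minimal numerical check of the "policy iteration = Newton's method"
# identity on a small discounted MDP. The 3-state / 2-action data are random
# illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, m, alpha = 3, 2, 0.9                      # states, actions, discount factor

P = rng.random((m, n, n))                    # P[u, i, j]: transition probabilities
P /= P.sum(axis=2, keepdims=True)
g = rng.random((m, n))                       # g[u, i]: stage cost

def evaluate(mu):
    """Policy evaluation: solve (I - alpha * P_mu) J = g_mu for J_mu."""
    P_mu = P[mu, np.arange(n), :]
    g_mu = g[mu, np.arange(n)]
    return np.linalg.solve(np.eye(n) - alpha * P_mu, g_mu)

def greedy(J):
    """Policy improvement: greedy policy with respect to J."""
    Q = g + alpha * P @ J                    # Q[u, i]
    return Q.argmin(axis=0)

def bellman(J):
    """Bellman optimality operator T applied to J."""
    return (g + alpha * P @ J).min(axis=0)

mu0 = np.zeros(n, dtype=int)                 # arbitrary starting policy
J0 = evaluate(mu0)                           # J_mu0
mu1 = greedy(J0)                             # improved policy

# one policy-iteration step: evaluate the improved policy
J_pi = evaluate(mu1)

# one Newton step for F(J) = T(J) - J at J0; the (generalized) Jacobian
# of F at J0 is alpha * P_mu1 - I, with mu1 the greedy policy at J0
P_mu1 = P[mu1, np.arange(n), :]
F = bellman(J0) - J0
J_newton = J0 - np.linalg.solve(alpha * P_mu1 - np.eye(n), F)

print("max |J_pi - J_newton| =", np.abs(J_pi - J_newton).max())  # agrees up to roundoff
```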