2017 IEEE 56th Annual Conference on Decision and Control (CDC) 2017
DOI: 10.1109/cdc.2017.8264202

Point-wise maximum approach to approximate dynamic programming

Abstract: We describe an approximate dynamic programming approach to compute lower bounds on the optimal value function for a discrete time, continuous space, infinite horizon setting. The approach iteratively constructs a family of lower bounding approximate value functions by using the so-called Bellman inequality. The novelty of our approach is that, at each iteration, we aim to compute an approximate value function that maximizes the point-wise maximum taken with the family of approximate value functions computed th…
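
The abstract's central idea, that a family of functions satisfying the Bellman inequality can be combined by a point-wise maximum into a tighter lower bound, can be illustrated on a toy problem. The sketch below is not the paper's method: it assumes a hypothetical 3-state, 2-action deterministic problem with discount factor 0.5 (the paper treats continuous state spaces), and the names `COST`, `NEXT`, `bellman`, and `satisfies_bellman_inequality` are invented for illustration. It checks that two hand-built lower bounds each satisfy V ≤ TV, that their point-wise maximum does too, and that the result stays below the optimal value function obtained by value iteration.

```python
# Toy illustration only: a hypothetical finite, discounted, cost-minimizing
# problem, not the continuous-space setting of the paper.
GAMMA = 0.5                                      # assumed discount factor
COST = [[1.0, 3.0], [5.0, 6.0], [2.0, 4.0]]      # COST[x][u]: stage cost
NEXT = [[0, 1], [2, 0], [1, 2]]                  # NEXT[x][u]: successor state
TOL = 1e-9                                       # tolerance for float comparisons


def bellman(v):
    """Apply the Bellman operator T to a value function given as a list."""
    return [
        min(COST[x][u] + GAMMA * v[NEXT[x][u]] for u in range(2))
        for x in range(3)
    ]


def satisfies_bellman_inequality(v):
    """Return True if v(x) <= (T v)(x) holds at every state."""
    return all(vx <= tvx + TOL for vx, tvx in zip(v, bellman(v)))


# Lower bound 1: the cheapest stage cost at each state.
v1 = [min(row) for row in COST]

# Lower bound 2: the globally cheapest stage cost paid forever.
c_min = min(v1)
v2 = [c_min / (1.0 - GAMMA)] * 3

# Point-wise maximum of the two lower bounds.
v_max = [max(a, b) for a, b in zip(v1, v2)]

# Reference: optimal value function via value iteration from zero.
v_star = [0.0] * 3
for _ in range(200):
    v_star = bellman(v_star)

print("v1    =", v1)
print("v2    =", v2)
print("v_max =", v_max)
print("v*    ~", [round(v, 6) for v in v_star])
print("v_max satisfies Bellman inequality:", satisfies_bellman_inequality(v_max))
```

Neither `v1` nor `v2` dominates the other in this example, so the point-wise maximum is strictly tighter than either one alone at some state. The combination remains a valid lower bound because the Bellman operator is monotone.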

Citations: cited by 6 publications (36 citation statements)
References: 32 publications (78 reference statements)
“…The dynamics f (x, u) take one of the following input-affine forms: (a) If in (7) we have R kl 0 for each index kl, then…”
Section: Restriction Of Problem Class (mentioning)
confidence: 99%
“…The benefit of a pointwise maximum combination is empirically demonstrated in [20] for a simple example, with the set of state-relevance weighting parameters hand-picked using problem-specific insight. In our previous work [27], we proposed a problem formulation with the point-wise maximum combination used in the Bellman inequality. The formulation was used to develop an iterative algorithm for computing lower bounding approximate value functions, however, the quality of the approximation, comparable with that of [20], still relies on the designer choosing a sequence of state-relevance weightings.…”
Section: B. Prior Work (mentioning)
confidence: 99%
“…We propose using a gradient ascent algorithm to address the non-convex point-wise maximum objective, and combine this with the algorithm proposed in [27] for computing a family of approximate value function whose point-wise maximum combination satisfies the Bellman inequality. The benefits of gradient ascent in this setting are two fold: 1) At each iteration of the gradient ascent algorithm the objective function is linear in the coefficients of the approximate value function and hence the computation requirements are comparable with existing methods; 2) The computation of a gradient direction has the interpretation of reducing the support of the state-relevance weighting distribution to a region of the state space that is relevant for the current iteration.…”
Section: Contributions and Outline (mentioning)
confidence: 99%
“…To improve the quality of the approximate value function, we use the approach proposed in [11] that solves a sequence of optimization problems, each with constraints of the same size as (9).…”
Section: Point-wise Maximum Approach To ADP (mentioning)
confidence: 99%
“…The steps given in [11] show how to reformulate (10c) as a polynomial inequality constraint similar to (7). The SOS S-Procedure is then applied and the resulting relaxation involves one LMI constraint with the same size as (9a), and j − 1 LMI constraints identical to (9b).…”
Section: Point-wise Maximum Approach To ADP (mentioning)
confidence: 99%