Proceedings of the 48th IEEE Conference on Decision and Control (CDC), held jointly with the 28th Chinese Control Conference, 2009
DOI: 10.1109/cdc.2009.5399685

Approximate dynamic programming using fluid and diffusion approximations with applications to power management

Abstract: Neuro-dynamic programming is a class of powerful techniques for approximating the solution to dynamic programming equations. In their most computationally attractive formulations, these techniques provide the approximate solution only within a prescribed finite-dimensional function class. Thus, the question that always arises is how should the function class be chosen? The goal of this paper is to propose an approach using the solutions to associated fluid and diffusion approximations. In order to illustrate t…
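As a hedged sketch of the basic idea, the snippet below fits a value function by generic discounted LSTD(0) on a toy single-server queue, with the fluid-model value function included as one of the basis functions. This is an illustration under simplifying assumptions (a discounted toy queue, not the paper's power-management model or its average-cost setting):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy birth-death queue under a fixed policy (illustrative only):
# arrival w.p. a per step, service w.p. s > a when the queue is nonempty.
a, s, gamma = 0.3, 0.5, 0.95
cost = lambda x: float(x)

# Fluid-model value function for cost c(x) = x and net drift d = s - a:
# the fluid state decays linearly, so the integrated cost is x^2 / (2 d).
d = s - a
phi = lambda x: np.array([1.0, x, x * x / (2 * d)])  # basis: constant, linear, fluid

# Discounted LSTD(0): solve A w = b, where
#   A = E[phi(X) (phi(X) - gamma * phi(X'))^T],  b = E[phi(X) c(X)],
# with expectations replaced by sample-path averages.
A = np.zeros((3, 3))
b = np.zeros(3)
x = 0
for _ in range(50_000):
    x_next = max(x + (rng.random() < a) - (x > 0 and rng.random() < s), 0)
    f = phi(x)
    A += np.outer(f, f - gamma * phi(x_next))
    b += f * cost(x)
    x = x_next

w = np.linalg.solve(A, b)          # weights for the three basis functions
J = lambda x: phi(x) @ w           # fitted approximate value function
```

The design choice illustrated here is the paper's central one: rather than guessing a function class, the fluid value function `x**2 / (2*d)` is placed directly in the basis, so the linear architecture only has to learn corrections to it.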



Cited by 25 publications (39 citation statements)
References 27 publications
“…This construction is useful for establishing properties of the relative value function in the following result. Similar convexity results can be found for models in the queueing literature (see [4], [6], [11]). …”
Section: Optimality Equations (supporting)
confidence: 67%
“…Our approach is related to the fluid-model approximations for value functions in [2]- [4], previously applied to approximate dynamic programming approaches of [5]. Our approach is more similar to the recent work [6] in which the approximation to the ACOE is obtained through Taylor series approximations. We obtain approximate solutions to the resultant first order ODE, which in turn yield basis functions useful for LSTD-learning.…”
Section: Introduction (mentioning)
confidence: 81%
“…This choice of optimality criterion is motivated by the fact that the value function J * approximates the relative value function for an associated average-cost optimization problem for a stochastic model [8,30,31]. However, computation of J * is infeasible in all but the simplest models.…”
Section: Fluid Models (mentioning)
confidence: 99%
“…In fact, the estimation of cost-to-go functions in traditional online learning (e.g., Q-learning and Reinforcement learning [10][11][12]) requires sufficient observation of a sample-path such that it hits all the states of the FSM a large number of times. Approximations of cost-to-go functions [13,14] are generally based on oversimplified models and thus cannot be accurately used in general practical networks. For instance, the fluid approximation proposed in [14] is based on the assumption that the cost-to-go function is smooth in the state space of the FSM, meaning that only small variations of its value computed in neighboring states are allowed. This assumption is suitable for simple cases (e.g., buffer models and cost functions modeling buffer congestion), but does not hold for more complex FSM models and general cost functions.…”
Section: Introduction (mentioning)
confidence: 99%