Optimized look‐ahead tree policies: a bridge between look‐ahead tree policies and direct policy search

Jung, Tobias; Wehenkel, Louis; Ernst, Damien; Maes, Francis

doi:10.1002/acs.2387

Cited by 4 publications

(18 citation statements)

References 48 publications

(71 reference statements)

Supporting

Mentioning

Contrasting

Order By: Relevance

“…However, OLT policies differ from LT ones in one significant way: whereas LT uses a generic node expansion heuristic, OLT relies on a parameterized node expansion heuristic exp score(n; θ) where the parameters θ are specifically optimized in an offline learning phase for the given target domain (f, ̺). The main advantage of OLT over LT is that this optimization can lead to a substantial reduction of the number of node expansions necessary to output good control actions (as was empirically demonstrated in [7], [9]), meaning that OLT can achieve the same performance as LT at a significantly lower online cost. (The disadvantage of OLT is of course that it needs this prior offline learning.)…”

Section: Optimized Look-ahead Tree Policiesmentioning

confidence: 96%

“…Having made this distinction, we can characterize DPS and LT as lying at opposite ends of the offline complexity / online complexity spectrum [7]. DPS techniques typically require huge offline resources for two reasons.…”

Section: Goal: Constrained Online Budgetmentioning

confidence: 99%

“…OLT policies as described in [7], [9] are also intended for deterministic problems with finite and small action spaces, with the nodes being also defined over states. However, OLT policies differ from LT ones in one significant way: whereas LT uses a generic node expansion heuristic, OLT relies on a parameterized node expansion heuristic exp score(n; θ) where the parameters θ are specifically optimized in an offline learning phase for the given target domain (f, ̺).…”

Section: Optimized Look-ahead Tree Policiesmentioning

confidence: 99%

“…The particular technique (or better, family of techniques) we are going to study is called optimized look-ahead tree policies (OLT) [7], [9]. OLT is a model-based hybrid technique that bridges the gap between look-ahead tree policies (LT) [3], [6] and direct policy search (DPS).…”

Section: Introductionmentioning

confidence: 99%

“…OLT was first introduced in [9], with the more comprehensive study in [7] forming the basis of the present work. Here, we extend the earlier approach in two ways:…”

Section: Introductionmentioning

confidence: 99%

See 4 more Smart Citations

Optimized look-ahead trees: Extensions to large and continuous action spaces

Jung

Ernst

Maes

2013

2013 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Self Cite

View full text Add to dashboard Cite

Abstract-This paper studies look-ahead tree based control policies from the viewpoint of online decision making with constraints on the computational budget allowed per decision (expressed as number of calls to the generative model). We consider optimized look-ahead tree (OLT) policies, a recently introduced family of hybrid techniques, which combine the advantages of look-ahead trees (high precision) with the advantages of direct policy search (low online cost) and which are specifically designed for limited online budgets. We present two extensions of the basic OLT algorithm that on the one side allow tackling deterministic optimal control problems with large and continuous action spaces and that on the other side can also help to further reduce the online complexity.

show abstract

Section: Optimized Look-ahead Tree Policiesmentioning

confidence: 96%

Section: Goal: Constrained Online Budgetmentioning

confidence: 99%

Section: Optimized Look-ahead Tree Policiesmentioning

confidence: 99%

Section: Introductionmentioning

confidence: 99%

“…OLT was first introduced in [9], with the more comprehensive study in [7] forming the basis of the present work. Here, we extend the earlier approach in two ways:…”

Section: Introductionmentioning

confidence: 99%

See 3 more Smart Citations

Optimized look-ahead trees: Extensions to large and continuous action spaces

Jung

Ernst

Maes

2013

2013 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL)

Self Cite

View full text Add to dashboard Cite

show abstract

Some recent advances in learning and adaptation for uncertain feedback control systems

Wang

Lewis

2014

Adaptive Control & Signal

View full text Add to dashboard Cite

SUMMARYLearning and adaptation are essential abilities for feedback control systems to improve performance under uncertainties and external disturbances. In the past decades, there are more and more research interests in developing feedback controllers with learning abilities to ensure stability or optimality of closed-loop systems. In this guest editorial for the special issue, some recent advances in this area are introduced from three perspectives. The first one is about new developments in adaptive dynamic programming and reinforcement learning methods, which use function approximators such as neural networks to approximately solve the adaptive optimal control problem of uncertain nonlinear systems. The second perspective is related to the learning issues in adaptive control systems based on neural networks. The third perspective includes some new results to deal with uncertainties in feedback control systems based on traditional nonlinear control approaches such as multi-step nonlinear model predictive control and nonlinear H-1 control.

show abstract