2020
DOI: 10.1007/978-3-030-61380-8_8
On the Performance of Planning Through Backpropagation

Cited by 3 publications (5 citation statements)
References 7 publications
“…In the past decade, deep learning (DL) methods have demonstrated remarkable success in a variety of complex applications in computer vision, natural language, and signal processing (Krizhevsky, Sutskever, and Hinton 2017; Hinton et al. 2012; Bengio, Lecun, and Hinton 2021). More recently, a variety of work has sought to leverage DL tools for planning and policy learning in a large variety of deterministic and stochastic decision-making domains (Wu, Say, and Sanner 2017; Wu, Say, and Sanner 2020; Say et al. 2020; Scaroni et al. 2020; Say 2021; Toyer et al. 2020; Garg, Bajpai, and Mausam 2020).…”
Section: Introduction
confidence: 99%
“…In the past decade, deep learning (DL) methods have demonstrated remarkable success in a variety of complex applications in computer vision, natural language, and signal processing (Krizhevsky, Sutskever, and Hinton 2017; Hinton et al. 2012; Bengio, Lecun, and Hinton 2021). More recently, a variety of work has sought to leverage DL tools for planning and policy learning in a large variety of deterministic and stochastic decision-making domains (Wu, Say, and Sanner 2017; Bueno et al. 2019; Wu, Say, and Sanner 2020; Say et al. 2020; Scaroni et al. 2020; Say 2021; Toyer et al. 2020; Garg, Bajpai, and Mausam 2020).…”
Section: Introduction
confidence: 99%
“…However, a recent direction of significant influence on the present work is the use of automatic differentiation in an end-to-end, model-based gradient descent framework that leverages recent advances in non-convex optimization from DL. The majority of work in this direction has focused on deterministic continuous planning models, both known (Wu, Say, and Sanner 2017; Scaroni et al. 2020) and learned (Wu, Say, and Sanner 2020; Say 2021). In this work, however, we are specifically concerned with learning deep reactive policies (DRPs) for fast decision-making in general continuous state-action MDPs (CSA-MDPs).…”
Section: Introduction
confidence: 99%
“…In addition, we proposed a different formulation of planning through backpropagation as trajectory optimization, thus making clear the distinction between learning internal representations in Recurrent Neural Networks (RNNs) (Goodfellow et al., 2016) and optimizing trajectories (either action trajectories, i.e., plans, in the shooting formulation of Definition 4.1.2, or state-action trajectories in the direct transcription of Definition 4.1.3). Preliminary results of our formulation and an analysis of the optimality gap have recently been published (Scaroni et al., 2020).…”
Section: Discussion
confidence: 99%
“…Inspired by the recurrent computation of Recurrent Neural Networks (RNNs), TensorPlan leverages the backpropagation-through-time technique to optimize the model inputs (i.e., the agent's actions) instead of the internal neural representations. In this thesis, we reinterpret TensorPlan and propose to formulate planning through backpropagation as trajectory optimization (Scaroni et al., 2020). We remark that this reinterpretation makes interesting connections to control theory and broadens the understanding of differentiable planning as a family of general gradient-based methods that can be built on top of different optimization formulations, which may lead to novel algorithms in the future.…”
Section: Thesis Proposal and Contributions
confidence: 99%
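The shooting formulation described in these excerpts (optimizing the action sequence itself by backpropagating a trajectory cost through a known dynamics model, rather than optimizing network weights) can be illustrated with a minimal sketch. The 1-D linear dynamics, the quadratic cost, and all names below are illustrative assumptions for this sketch, not taken from the cited papers:

```python
def plan_by_gradient_descent(s0, goal, horizon=5, steps=200, lr=0.1):
    """Shooting-style planning through backpropagation (illustrative sketch).

    Optimizes an action sequence a_0..a_{H-1} by gradient descent on the
    trajectory cost J = sum_t a_t^2 + (s_T - goal)^2, for the assumed
    1-D linear dynamics s_{t+1} = s_t + a_t.
    """
    actions = [0.0] * horizon
    for _ in range(steps):
        # Forward pass: roll the known dynamics out over the horizon.
        states = [s0]
        for a in actions:
            states.append(states[-1] + a)
        # Backward pass: backpropagate the cost through time to the inputs.
        # Here ds_T/da_t = 1 for every t, so dJ/da_t = 2*a_t + 2*(s_T - goal).
        terminal_grad = 2.0 * (states[-1] - goal)
        grads = [2.0 * a + terminal_grad for a in actions]
        # Gradient step on the plan itself (not on any network weights).
        actions = [a - lr * g for a, g in zip(actions, grads)]
    return actions, s0 + sum(actions)

actions, s_T = plan_by_gradient_descent(s0=0.0, goal=1.0)
```

For this convex toy cost the optimum is analytic (every a_t = 1/(H+1), so s_T = H/(H+1)), which makes the sketch easy to check; the point of the technique in the cited work is that the same forward-rollout/backward-gradient loop applies when the dynamics are a learned or nonlinear differentiable model and the gradients come from automatic differentiation.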