2022 · Preprint
DOI: 10.48550/arxiv.2205.04667

Variational Inference MPC using Normalizing Flows and Out-of-Distribution Projection

Abstract: We propose a Model Predictive Control (MPC) method for collision-free navigation that uses amortized variational inference to approximate the distribution of optimal control sequences by training a normalizing flow conditioned on the start, goal and environment. This representation allows us to learn a distribution that accounts for both the dynamics of the robot and complex obstacle geometries. We can then sample from this distribution to produce control sequences which are likely to be both goal-directed and…
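
The abstract only sketches the method at a high level. As a rough illustration of the core idea, sampling candidate control sequences from a normalizing flow conditioned on the planning context, here is a minimal PyTorch sketch; every name in it (ConditionalAffineCoupling, ControlSequenceFlow, the dimensions, the context encoding) is an illustrative assumption, not the authors' implementation.

```python
# Hypothetical sketch: sampling control sequences from a conditional
# normalizing flow, in the spirit of the abstract. Not the authors' code.
import torch
import torch.nn as nn


class ConditionalAffineCoupling(nn.Module):
    """One RealNVP-style coupling layer whose scale/shift networks also
    see the conditioning context (start, goal, environment encoding)."""

    def __init__(self, dim, context_dim, hidden=128):
        super().__init__()
        self.half = dim // 2
        in_dim = self.half + context_dim
        out_dim = dim - self.half
        self.scale = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, out_dim), nn.Tanh()
        )
        self.shift = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, out_dim)
        )

    def forward(self, z, context):
        z1, z2 = z[:, : self.half], z[:, self.half :]
        h = torch.cat([z1, context], dim=-1)
        z2 = z2 * torch.exp(self.scale(h)) + self.shift(h)
        # Flip the halves so successive layers transform different coordinates.
        return torch.cat([z2, z1], dim=-1)


class ControlSequenceFlow(nn.Module):
    """Maps Gaussian noise to a flattened control sequence u_{0:H-1},
    conditioned on a context vector."""

    def __init__(self, horizon, control_dim, context_dim, n_layers=6):
        super().__init__()
        self.horizon, self.control_dim = horizon, control_dim
        self.dim = horizon * control_dim
        self.layers = nn.ModuleList(
            ConditionalAffineCoupling(self.dim, context_dim) for _ in range(n_layers)
        )

    def sample(self, context, n_samples):
        z = torch.randn(n_samples, self.dim)
        ctx = context.expand(n_samples, -1)
        for layer in self.layers:
            z = layer(z, ctx)
        return z.view(n_samples, self.horizon, self.control_dim)


# Usage sketch: draw candidate control sequences for one MPC step, then
# score them with the task cost (dynamics rollout plus collision checking,
# both omitted here) and execute the first action of the best sequence.
flow = ControlSequenceFlow(horizon=20, control_dim=2, context_dim=32)
context = torch.randn(1, 32)  # stand-in for [start; goal; env encoding]
candidates = flow.sample(context, n_samples=64)  # shape (64, 20, 2)
```

Conditioning every coupling layer on the same context vector is one common way to amortize a flow across start/goal/environment instances, which matches the abstract's description of a distribution trained per-context rather than per-problem.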

Cited by 3 publications (3 citation statements) · References 21 publications
“…On the other hand, modern learning techniques such as neural networks can learn an embedding of a larger set of parameters that maps to the solutions [16]. One advantage of using neural network methods is the capability to generalize to out-of-distribution situations that are not included in the training set [29].…”
Section: Related Work, A. Parametric Programming · Citation type: mentioning · Confidence: 99%
“…Learning a prior distribution of actions and subgoals has been used to speed up MPC and accomplish complex tasks. Power and Berenson [25] leverage normalizing flow for modeling the action distributions. Wang and Ba [26] use a policy network to initialize the action sequences for MPC.…”
Section: MPC With a Learned Prior · Citation type: mentioning · Confidence: 99%
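
The quote above describes warm-starting MPC from a learned prior over actions. A hypothetical sketch of that pattern, in the same spirit: a learned policy proposes a nominal action sequence, and a sampling-based MPC (plain random shooting here) perturbs it instead of searching from scratch. The names `policy`, `dynamics`, and `cost` are stand-ins, not any cited paper's API.

```python
# Hypothetical warm-started sampling-based MPC step; not from any cited paper.
import torch


def mpc_step(policy, dynamics, cost, state, horizon=15, n_samples=256, noise=0.1):
    # Roll the learned policy forward to get a nominal action sequence.
    nominal, s = [], state
    for _ in range(horizon):
        a = policy(s)
        nominal.append(a)
        s = dynamics(s, a)
    nominal = torch.stack(nominal)  # (horizon, action_dim)

    # Sample perturbations around the nominal sequence and score each rollout.
    samples = nominal + noise * torch.randn(n_samples, *nominal.shape)
    costs = torch.zeros(n_samples)
    for i in range(n_samples):
        s = state
        for t in range(horizon):
            costs[i] += cost(s, samples[i, t])
            s = dynamics(s, samples[i, t])

    # Receding horizon: execute only the first action of the best sequence.
    return samples[costs.argmin(), 0]
```

Centering the sampling distribution on a learned proposal is what "speeds up" MPC here: far fewer samples are wasted on action sequences the prior already rules out.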
“…Of particular relevance to our framework are methods that combine principled control strategies with learned components in a hierarchical way. Examples include using LQR control in the inner problem with learnable cost and dynamics (Tamar et al., 2017; Amos et al., 2018; Agrawal et al., 2019b), learning sampling distributions in planning and control (Ichter et al., 2018; Power & Berenson, 2022; Amos & Yarats, 2020), or learning optimization strategies or goals for optimization-based control (Sacks & Boots, 2022; Xiao et al., 2022; Metz et al., 2019; 2022; Lew et al., 2022).…”
Section: Related Work · Citation type: mentioning · Confidence: 99%