2019
DOI: 10.48550/arxiv.1902.05542
Preprint

Unsupervised Visuomotor Control through Distributional Planning Networks

Cited by 8 publications (10 citation statements) | References 0 publications

“…The idea is to learn a forward model of the world, which forecasts the outcome of an action. In the case of robot control, a popular approach is to learn the state-action transition models in a latent feature embedding space, which are further used for motion planning [8], [9], [10]. Likewise, visual foresight [11] leverages a deep video prediction model to plan the end-effector motion by sampling actions leading to a state which approximates the goal image.…”
Section: Related Work | Citation type: mentioning
Confidence: 99%
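
The latent-space planning approach this excerpt summarizes can be made concrete with a small random-shooting sketch. Everything below is hypothetical scaffolding rather than the method of [8]-[10] or of visual foresight [11]: `encode` and `latent_dynamics` stand in for learned models, and the planner simply samples action sequences, rolls them forward in the latent space, and keeps the sequence whose final embedding lands closest to the goal embedding.

```python
import numpy as np

def encode(obs):
    # Hypothetical learned embedding; identity here purely for illustration.
    return np.asarray(obs, dtype=np.float64)

def latent_dynamics(z, a):
    # Hypothetical learned transition model z' = f(z, a).
    return z + 0.1 * a

def plan(obs, goal_obs, horizon=10, n_samples=256, action_dim=2, seed=0):
    """Random-shooting planner: score sampled action sequences by the
    distance of their final latent state to the goal embedding."""
    rng = np.random.default_rng(seed)
    z0, z_goal = encode(obs), encode(goal_obs)
    actions = rng.uniform(-1.0, 1.0, size=(n_samples, horizon, action_dim))
    costs = np.empty(n_samples)
    for i in range(n_samples):
        z = z0
        for t in range(horizon):
            z = latent_dynamics(z, actions[i, t])
        costs[i] = np.linalg.norm(z - z_goal)
    return actions[int(np.argmin(costs))]

# Toy usage: plan a 2-d action sequence from the origin toward a goal.
best_actions = plan(np.zeros(2), np.ones(2))
```

A sampling-based scoring loop like this is the simplest stand-in for the cited planners; the cited works replace the placeholders with learned deep models and stronger optimizers.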
“…In low-dimensional tasks, one can simply take the reward to be the negative ℓ2-distance in the state space (Andrychowicz et al., 2017). However, defining distance metrics is more challenging in high-dimensional spaces, such as images (Yu et al., 2019). Prior work on visual goal-conditioned RL (Nair et al., 2018; Pong et al., 2019) trains an additional state representation model, such as a VAE encoder e_VAE : S → Z_VAE.…”
Section: Goal-conditioned Reinforcement Learning | Citation type: mentioning
Confidence: 99%
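
The reward construction described here is straightforward to sketch. In the snippet below, `e_vae` is a placeholder mirroring the quoted e_VAE : S → Z_VAE notation (not any specific paper's encoder), and the reward is the negative ℓ2-distance to the goal, computed either directly in state space or in the learned latent space.

```python
import numpy as np

def e_vae(s):
    # Placeholder for a trained VAE encoder mapping states or images to
    # latent codes; here it just flattens and truncates to a 16-d vector.
    return np.asarray(s, dtype=np.float64).reshape(-1)[:16]

def goal_reward(state, goal, encode=None):
    """Negative L2 distance between state and goal, optionally measured
    in a learned latent space instead of the raw state space."""
    if encode is not None:
        state, goal = encode(state), encode(goal)
    return -float(np.linalg.norm(state - goal))

# Low-dimensional case: distance taken directly in state space.
r_raw = goal_reward(np.zeros(3), np.ones(3))

# High-dimensional case (e.g., images): distance in the VAE latent space.
r_latent = goal_reward(np.zeros((8, 8)), np.ones((8, 8)), encode=e_vae)
```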
“…High-dimensional non-convex optimization problems that have a lot of structure in the solution space naturally arise in the control setting, where the controller seeks to optimize the same objective in the same controlled dynamical system from different starting states. This has been investigated in, e.g., planning (Ichter et al., 2018; Ichter & Pavone, 2019; Mukadam et al., 2018; Kurutach et al., 2018; Srinivas et al., 2018; Yu et al., 2019; Lynch et al., 2019) and policy distillation (Wang & Ba, 2019). Chandak et al. (2019) show how to learn an action space for model-free learning, and Co-Reyes et al. (2018) and Antonova et al. (2019) embed action sequences with a VAE.…”
Section: RL and Control | Citation type: mentioning
Confidence: 99%
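
The closing idea, embedding action sequences with a VAE, amounts to searching over a low-dimensional latent code and decoding it into a full action sequence, rather than optimizing in the raw horizon × action-dimension space. The sketch below is illustrative only: `decode_sequence` and `rollout_cost` are hypothetical placeholders, not the models of Co-Reyes et al. (2018) or Antonova et al. (2019).

```python
import numpy as np

def decode_sequence(w, horizon=10, action_dim=2):
    # Placeholder for a trained VAE decoder mapping a latent code w to an
    # action sequence; a fixed random linear map stands in for it here.
    rng = np.random.default_rng(42)
    W = rng.normal(size=(horizon * action_dim, w.size))
    return (W @ w).reshape(horizon, action_dim)

def rollout_cost(actions):
    # Placeholder cost: reach a target cumulative action with small effort.
    return float(np.linalg.norm(actions.sum(axis=0) - 1.0)
                 + 0.01 * np.square(actions).sum())

def latent_search(latent_dim=4, n_samples=512, seed=0):
    """Random search over the latent code space; each code decodes to an
    action sequence, and the cheapest decoded sequence is returned."""
    rng = np.random.default_rng(seed)
    codes = rng.normal(size=(n_samples, latent_dim))
    costs = [rollout_cost(decode_sequence(w)) for w in codes]
    return decode_sequence(codes[int(np.argmin(costs))])

best_sequence = latent_search()
```

Searching in the latent space keeps the optimization problem low-dimensional while the decoder carries the structure of plausible action sequences, which is the appeal of this family of methods.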