2020
DOI: 10.48550/arxiv.2008.11867
Preprint

Planning in Learned Latent Action Spaces for Generalizable Legged Locomotion

Abstract: Hierarchical learning has been successful at learning generalizable locomotion skills on walking robots in a sample-efficient manner. However, the low-dimensional "latent" action used to communicate between different layers of the hierarchy is typically user-designed. In this work, we present a fully learned hierarchical framework that is capable of jointly learning the low-level controller and the high-level action space. Next, we plan over latent actions in a model-predictive control fashion, using a learned…
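The abstract describes planning over a learned latent action space in a model-predictive control fashion. A minimal sketch of that idea is random-shooting MPC: sample candidate latent-action sequences, roll each out through a (here, stand-in) latent dynamics model, and execute the first latent action of the lowest-cost sequence. All names, dimensions, and the toy dynamics below are illustrative assumptions, not the paper's actual models.

```python
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM, LATENT_DIM = 2, 4   # illustrative sizes, not from the paper
HORIZON, N_CANDIDATES = 5, 256

# Stand-in for a learned latent-transition model f(s, z) -> s'.
W = rng.normal(scale=0.1, size=(LATENT_DIM, STATE_DIM))

def dynamics(state, z):
    return state + np.tanh(z) @ W

def cost(state, goal):
    return np.linalg.norm(state - goal)

def plan_latent_action(state, goal):
    """Random-shooting MPC over latent actions: sample latent sequences,
    roll them out, and return the first latent action of the best one."""
    best_z, best_cost = None, np.inf
    for _ in range(N_CANDIDATES):
        zs = rng.normal(size=(HORIZON, LATENT_DIM))
        s = state
        for z in zs:
            s = dynamics(s, z)
        c = cost(s, goal)
        if c < best_cost:
            best_cost, best_z = c, zs[0]
    return best_z

z0 = plan_latent_action(np.zeros(STATE_DIM), np.ones(STATE_DIM))
print(z0.shape)  # (4,)
```

In a real MPC loop, only `z0` would be handed to the low-level controller, then the plan would be recomputed at the next step.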

Cited by 3 publications (3 citation statements)
References: 27 publications
“…Our work has some parallels with model-based RL, as we do not rely on a full physics simulator or a physical robot to perform Monte Carlo rollouts. Recent work using model-based RL for legged robot control still needs to collect simulation data or physical robot data to learn either a transition model that contains the full state information [26] or center of mass (COM) information [27]. We rely instead on the simple centroidal dynamics to perform Monte Carlo rollouts, which does not require data from a full model or a physical robot.…”
Section: B. Deep Reinforcement Learning for Quadrupedal Robots
confidence: 99%
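The citation statement above contrasts learning a full transition model from data with Monte Carlo rollouts under simple centroidal dynamics, which need no full-model or physical-robot data. A toy sketch of such a rollout, treating the center of mass as a point mass under sampled contact forces, could look like the following; the masses, time step, and force-sampling scheme are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

MASS, DT, G = 12.0, 0.02, 9.81   # illustrative quadruped COM parameters
HORIZON = 10

def centroidal_step(pos, vel, force):
    """One Euler step of point-mass centroidal dynamics: m*a = f + m*g."""
    acc = force / MASS + np.array([0.0, 0.0, -G])
    vel = vel + acc * DT
    pos = pos + vel * DT
    return pos, vel

def monte_carlo_rollout(pos, vel):
    """Roll out one sampled contact-force sequence. No full-body
    simulator or robot data is required, only the analytic model."""
    for _ in range(HORIZON):
        f = rng.normal(loc=[0.0, 0.0, MASS * G], scale=5.0, size=3)
        pos, vel = centroidal_step(pos, vel, f)
    return pos

final_pos = monte_carlo_rollout(np.zeros(3), np.zeros(3))
print(final_pos.shape)  # (3,)
```

Many such rollouts can be scored and compared cheaply, which is the point the citing authors make against learned full-state transition models.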
“…From a design perspective, HRL allows for separate acquisition of low-level policies (options; skills), which can dramatically accelerate learning on downstream tasks. A variety of works propose the discovery of such low-level primitives from random walks [26,52], mutual information objectives [17,13,43,6], datasets of agent or expert traces [37,2,25], motion capture data [36,31,42], or from dedicated pre-training tasks [15,27].…”
Section: Related Work
confidence: 99%
“…We present z_i as scalar variables for clarity, but they can be generalized to higher-dimensional encoders, whose parameters and inputs are optimized using gradient descent, with z_i becoming the learned, fixed output of the z encoder. Such learned embeddings have been explored in the literature before, for example in [20], [22], [18], [10]. However, none of these works generalize to multiple robot morphologies or use a high-dimensional image input as state.…”
Section: B. Navigation Policies with Robot-Specific Embedding
confidence: 99%
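The statement above describes a robot-specific embedding z_i that is learned jointly with the policy and then fixed. A minimal sketch of the inference-time structure, with one embedding vector per robot morphology concatenated onto the state, might look like this; the sizes and the lookup-table form are assumptions for illustration, not the citing paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(2)

N_ROBOTS, EMBED_DIM, STATE_DIM = 3, 8, 10  # illustrative sizes

# One embedding vector per robot morphology. During training these
# rows would be updated by gradient descent alongside policy weights;
# afterwards each z_i is a fixed, learned vector.
Z = rng.normal(size=(N_ROBOTS, EMBED_DIM))

def policy_input(state, robot_id):
    """Concatenate the robot state with its learned embedding z_i."""
    return np.concatenate([state, Z[robot_id]])

x = policy_input(np.zeros(STATE_DIM), robot_id=1)
print(x.shape)  # (18,)
```

A shared policy network fed `x` can then condition its behavior on which robot it is controlling, which is what lets one policy cover multiple morphologies.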