2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
DOI: 10.1109/iros40897.2019.8967913

Hierarchical Reinforcement Learning for Quadruped Locomotion

Abstract: Legged locomotion is a challenging task for learning algorithms, especially when the task requires a diverse set of primitive behaviors. To solve these problems, we introduce a hierarchical framework to automatically decompose complex locomotion tasks. A high-level policy issues commands in a latent space and also selects for how long the low-level policy will execute the latent command. Concurrently, the low-level policy uses the latent command and only the robot's on-board sensors to control the robot's actuators. […]
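The abstract describes a two-level control scheme: a high-level policy emits a latent command together with a duration, and a low-level policy converts that command plus on-board (proprioceptive) sensor readings into actuator commands. The sketch below illustrates only that control loop; the class internals, observation layout, and dimensions are placeholder assumptions, not the authors' implementation.

```python
import numpy as np

# Minimal sketch of the two-level control loop described in the abstract.
# Network internals, observation layout, and dimensions are placeholder
# assumptions, not the authors' implementation.

LATENT_DIM = 4      # assumed size of the latent command space
ACTION_DIM = 12     # e.g. 12 joint position targets for a quadruped


class HighLevelPolicy:
    """Maps the full observation to a latent command and a hold duration."""

    def act(self, observation):
        latent = np.tanh(np.random.randn(LATENT_DIM))  # stand-in for a learned network
        duration = np.random.randint(1, 10)            # low-level steps to hold the command
        return latent, duration


class LowLevelPolicy:
    """Maps (on-board sensor reading, latent command) to actuator commands."""

    def act(self, proprio, latent):
        return np.zeros(ACTION_DIM)                    # stand-in for a learned network


def rollout(env, high_level, low_level, horizon=1000):
    """Run one episode; `env` is a hypothetical simulator whose observations
    expose the on-board sensor reading under the key "proprio"."""
    obs = env.reset()
    t = 0
    while t < horizon:
        latent, duration = high_level.act(obs)              # pick a command and how long to hold it
        for _ in range(duration):
            action = low_level.act(obs["proprio"], latent)  # low level sees only proprioception + latent
            obs, reward, done, info = env.step(action)
            t += 1
            if done or t >= horizon:
                return
```

Holding the latent command for a variable number of low-level steps is what gives the high level temporal abstraction: it reasons over behaviors rather than individual motor commands.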

Cited by 40 publications (26 citation statements)
References 14 publications
“…While encouraging results have been achieved using Model Predictive Control (MPC) and trajectory optimization [24,10,18,9,19,26,4,75], these methods require in-depth knowledge of the environment and substantial manual parameter tuning, which makes them challenging to apply to complex environments. Alternatively, model-free RL can learn general policies for tasks with challenging terrain [43,90,53,63,64,77,35,46,85,36,38,84,44]. For example, Xie et al. [85] introduce dynamics randomization to generalize an RL locomotion policy across different environments, and Peng et al. [64] use animal videos to provide demonstrations for imitation learning.…”
Section: Related Work
Mentioning, confidence: 99%
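The statement above mentions dynamics randomization as a way to make a learned locomotion policy generalize across environments. The snippet below is a minimal sketch of that idea under assumed simulator hooks: the parameter names, sampling ranges, and the `set_dynamics` setter are illustrative and not taken from the cited work.

```python
import numpy as np

# Minimal sketch of dynamics randomization: physical parameters are resampled
# every episode so the policy cannot overfit to one simulator configuration.
# Parameter names, ranges, and the `set_dynamics` hook are illustrative
# assumptions, not taken from the cited work.

def sample_dynamics():
    return {
        "ground_friction":      np.random.uniform(0.5, 1.25),
        "base_mass_scale":      np.random.uniform(0.8, 1.2),
        "motor_strength_scale": np.random.uniform(0.8, 1.2),
        "control_latency_s":    np.random.uniform(0.0, 0.04),
    }


def run_training_episode(env, policy):
    """`env` is a hypothetical simulator exposing a way to override its
    physical parameters before each episode."""
    env.set_dynamics(sample_dynamics())
    obs = env.reset()
    done = False
    while not done:
        action = policy.act(obs)
        obs, reward, done, info = env.step(action)
```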
“…Finally, we combine both features for policy action prediction. The resulting model is trained end-to-end directly from rewards, without hierarchical RL [62,41,31,38] or pre-defined controllers [15,21].…”
Section: Introduction
Mentioning, confidence: 99%
“…Hierarchical Reinforcement Learning (HRL) [9], [10] is widely used for robotic tasks such as manipulation [5], navigation [6], and locomotion [26]. In this work, we refrain from learning the low-level policy parameters, hence our method is not HRL-based.…”
Section: Related Work
Mentioning, confidence: 99%
“…Hierarchical Reinforcement Learning. Hierarchical Reinforcement Learning (HRL) [25], [26] has also proven to be an effective tool in tackling the locomotion problem. Furthermore, researchers [27] have developed HRL strategies for controlling legged characters in simulation, which have been further combined with adversarial learning [28] to achieve high-level control.…”
Section: Related Work
Mentioning, confidence: 99%