“…While encouraging results have been achieved with Model Predictive Control (MPC) and trajectory optimization [24,10,18,9,19,26,4,75], these methods require in-depth knowledge of the environment and substantial manual parameter tuning, which makes them difficult to apply in complex environments. Alternatively, model-free RL can learn general policies for tasks on challenging terrain [43,90,53,63,64,77,35,46,85,36,38,84,44]. For example, Xie et al. [85] use dynamics randomization to generalize an RL locomotion policy across different environments, and Peng et al. [64] use animal videos as demonstrations for imitation learning.…”
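The dynamics-randomization idea mentioned above can be sketched as resampling physical parameters at the start of each training episode so the learned policy cannot overfit to one fixed environment. The following is a minimal illustrative sketch, not the cited authors' implementation: the parameter names, their ranges, and the placeholder episode loop are all assumptions for illustration.

```python
import random

def sample_dynamics(rng):
    # Resample physical parameters each episode (ranges are illustrative).
    return {
        "mass_scale": rng.uniform(0.8, 1.2),   # scale on link masses
        "friction": rng.uniform(0.5, 1.5),     # ground friction coefficient
        "latency": rng.choice([0, 1, 2]),      # actuation delay in timesteps
    }

def train_with_randomization(num_episodes, seed=0):
    rng = random.Random(seed)
    sampled = []
    for _ in range(num_episodes):
        dyn = sample_dynamics(rng)
        sampled.append(dyn)
        # ... configure the simulator with `dyn`, then run one RL
        # episode and update the policy (omitted placeholder) ...
    return sampled

params = train_with_randomization(3)
```

Because the policy is trained across the whole sampled distribution rather than a single nominal model, it tends to transfer better to environments whose dynamics fall inside (or near) the randomized ranges.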