Cat-Like Jumping and Landing of Legged Robots in Low Gravity Using Deep Reinforcement Learning

Rudin, Nikita; Kolvenbach, Hendrik; Tsounis, Vassilios; Hutter, Marco

doi:10.1109/tro.2021.3084374

Cited by 94 publications

(93 citation statements)

References 30 publications

(36 reference statements)


“…There are some works that use RL to deal with quadruped robot jumping. They are used to address the (re)orientation problem of the robot's 3D posture during the jumping flight phase in the case of low gravity (e.g., moon) [7], or compensating for the error of the jumping trajectory caused by disturbance [8], and training the robot to have cat-like action to ensure the landing phase's safety. However, while these systems have the advantage of transferring the policy to the robot's onboard computer after training and computing the necessary behavior from policy in a short time.…”

Section: Reinforcement Learning (RL); mentioning

confidence: 99%

“…However, while these systems have the advantage of transferring the policy to the robot's onboard computer after training and computing the necessary behavior from policy in a short time. [7], [9] However, extensive data collecting is required in the early stages. Meanwhile, it does not develop a motion planning policy to conduct many complex jumping, such as doing a left-flip and a back-flip at the same time, nor does it consider the problem of optimal energy consumption to select the best trajectory from plausible options.…”

Section: Reinforcement Learning (RL); mentioning

confidence: 99%

An Optimal Motion Planning Framework for Quadruped Jumping

Zhang, Yue, Sun et al., 2022

Preprint

This paper presents an optimal motion planning framework to generate versatile energy-optimal quadrupedal jumping motions automatically (e.g., flips, spin). The jumping motions via the centroidal dynamics are formulated as a 12-dimensional black-box optimization problem subject to the robot kino-dynamic constraints. Gradient-based approaches offer great success in addressing trajectory optimization (TO), yet, prior knowledge (e.g., reference motion, contact schedule) is required and results in sub-optimal solutions. The new proposed framework first employed a heuristics-based optimization method to avoid these problems. Moreover, a prioritization fitness function is created for heuristics-based algorithms in robot ground reaction force (GRF) planning, enhancing convergence and searching performance considerably. Since heuristics-based algorithms often require significant time, motions are planned offline and stored as a pre-motion library. A selector is designed to automatically choose motions with user-specified or perception information as input. The proposed framework has been successfully validated only with a simple continuously tracking PD controller in an open-source Mini-Cheetah by several challenging jumping motions, including jumping over a window-shaped obstacle with 30 cm height and left-flipping over a rectangle obstacle with 27 cm height. (Video ⋆

“…Deep reinforcement learning (RL) is now actively used in many areas, including playing games [25,33], robot manipulation [24,16], and legged robotics [13,29]. The leading (general-purpose) algorithms within the continuous control RL community are either deterministic, such as DDPG [23] and TD3 [5], or stochastic, such as SAC [11] and PPO [31].…”

Section: Introduction; mentioning

confidence: 99%

Bingham Policy Parameterization for 3D Rotations in Reinforcement Learning

James, Abbeel, 2022

Preprint

We propose a new policy parameterization for representing 3D rotations during reinforcement learning. Today in the continuous control reinforcement learning literature, many stochastic policy parameterizations are Gaussian. We argue that universally applying a Gaussian policy parameterization is not always desirable for all environments. One such case in particular where this is true are tasks that involve predicting a 3D rotation output, either in isolation, or coupled with translation as part of a full 6D pose output. Our proposed Bingham Policy Parameterization (BPP) models the Bingham distribution and allows for better rotation (quaternion) prediction over a Gaussian policy parameterization in a range of reinforcement learning tasks. We evaluate BPP on the rotation Wahba problem task, as well as a set of vision-based next-best pose robot manipulation tasks from RLBench. We hope that this paper encourages more research into developing other policy parameterization that are more suited for particular environments, rather than always assuming Gaussian.

“…Several other approaches such as reinforcement learning or model-predictive control (MPC) present interesting landing behaviors also demonstrated on hardware. [9] demonstrated planar landing and airborne orientation control on the SpaceBok quadruped in a "low-gravity" environment, but it is unclear how easily the algorithm could be applied for real-world, 3D conditions with more dramatic impacts and inertial effects. While planar landing and jumping was also demonstrated in [10] and [11], touchdown was made without considering optimal touchdown positions or timings, and were from relatively low heights with little pitch or roll of the body.…”

Section: Introduction; mentioning

confidence: 99%

Real-time Optimal Landing Control of the MIT Mini Cheetah

Jeon, Kim, Kim, 2021

Preprint

Quadrupedal landing is a complex process involving large impacts, elaborate contact transitions, and is a crucial recovery behavior observed in many biological animals. This work presents a real-time, optimal landing controller that is free of pre-specified contact schedules. The controller determines optimal touchdown postures and reaction force profiles and is able to recover from a variety of falling configurations. The quadrupedal platform used, the MIT Mini Cheetah, recovered safely from drops of up to 8 m in simulation, as well as from a range of orientations and planar velocities. The controller is also tested on hardware, successfully recovering from drops of up to 2 m.
