2020
DOI: 10.48550/arxiv.2011.02404
Preprint

Dynamics Randomization Revisited: A Case Study for Quadrupedal Locomotion

Cited by 4 publications (6 citation statements). References 21 publications.
“…It is often realized via a comprehensive simulation of the robot, e.g., [9,19,20]. Rewards designed based on reference trajectories [6,7] or carefully tuned reward terms [8,10,21,22] are often necessary to regularize undesirable behaviors as to be feasible for a physical robot. The computation cost to train a policy often requires millions to billions of transition tuples of the full physics simulation.…”
Section: B. Deep Reinforcement Learning for Quadrupedal Robots (mentioning, confidence: 99%)
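
The statement above refers to rewards built from reference trajectories plus hand-tuned penalty terms. As a rough illustration only, not the formulation of any cited paper, here is a minimal sketch of such a tracking reward; every function name and weight below is an assumption:

# Illustrative sketch only: a reference-trajectory tracking reward with a
# hand-tuned torque penalty, the kind of term mix described in the statement.
import numpy as np

def tracking_reward(joint_angles, joint_torques, ref_joint_angles,
                    pose_weight=5.0, torque_weight=0.005):
    # Reward matching the reference pose; exp(...) keeps the term in (0, 1].
    pose_error = np.sum((joint_angles - ref_joint_angles) ** 2)
    pose_term = np.exp(-pose_weight * pose_error)
    # Penalize large joint torques to discourage commands infeasible on hardware.
    torque_term = torque_weight * np.sum(np.square(joint_torques))
    return pose_term - torque_term

In practice each weight of this kind has to be retuned per robot and per gait, which is part of the tuning burden the statement describes.
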
“…The learned policy works directly for the inverted-configuration robot. We also train a trotting policy for the default Laikago, utilizing the full physics simulation and directly generate commands at the joint control level, similar to [6,7]. This end-to-end trained policy works well with the default configuration of Laikago, as expected, but fails to generalize to the inverted configuration due to the learned control policy being very specific to the morphology it was trained on.…”
Section: A. Trotting and Walking on Flat Terrain (mentioning, confidence: 99%)
“…To overcome the sim-to-real gap, previous work also utilizes domain randomization to train a robust policy to adapt to a wide range of dynamic parameter settings [18]. However, recent work argues that it is possible to transfer the simulation controller directly with domain adaptation, by calibrating the dynamic parameters in simulation [28]. In this work, we use domain adaptation to narrow the sim-to-real gap instead.…”
Section: B. Domain Adaptation (mentioning, confidence: 99%)
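
Since the statement above contrasts randomizing dynamics during training with calibrating them afterward, a minimal sketch of the per-episode randomization side may be useful; the parameter names, ranges, and the env.set_dynamics interface are assumptions for illustration, not any simulator's actual API:

# Illustrative sketch only: resample physical parameters at the start of each
# episode so the trained policy must cope with the whole range of dynamics.
import numpy as np

PARAM_RANGES = {
    "base_mass_scale": (0.8, 1.2),       # +/- 20% of the nominal base mass
    "foot_friction": (0.5, 1.25),        # contact friction coefficient
    "motor_strength_scale": (0.8, 1.2),  # scaling applied to commanded torques
    "control_latency_s": (0.0, 0.04),    # actuation delay in seconds
}

def sample_dynamics(rng):
    # One random draw of dynamics parameters for the next episode.
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in PARAM_RANGES.items()}

def run_episode(env, policy, rng):
    env.set_dynamics(sample_dynamics(rng))  # hypothetical setter on the simulator
    obs, done, total_reward = env.reset(), False, 0.0
    while not done:
        obs, reward, done, _ = env.step(policy(obs))
        total_reward += reward
    return total_reward

Domain adaptation as described in [28] would instead fix these parameters to values identified from measurements of the real robot rather than resampling them every episode.
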
“…Legged Locomotion: This has conventionally been accomplished using control theory [2,5,6,22,28,31,33,39,55,63,72,88] over handcrafted dynamics models. Recently, RL has been successfully used to learn such policies in simulation [21,49,56,68] and in the real world with sim2real methods [25,29,59,61,75,77,85]. Alternatively, a policy learnt in simulation can be adapted at test-time to work well in real environments [15,19,45,62,70,71,89,90,91,92,95].…”
Section: Related Work (mentioning, confidence: 99%)