2022
DOI: 10.48550/arxiv.2203.17138
Preprint

Imitate and Repurpose: Learning Reusable Robot Movement Skills From Human and Animal Behaviors

Abstract: We investigate the use of prior knowledge of human and animal movement to learn reusable locomotion skills for real legged robots. Our approach builds upon previous work on imitating human or dog Motion Capture (MoCap) data to learn a movement skill module. Once learned, this skill module can be reused for complex downstream tasks. Importantly, due to the prior imposed by the MoCap data, our approach does not require extensive reward engineering to produce sensible and natural looking behavior at the time of r…
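A minimal sketch of the two-stage recipe the abstract describes, assuming a hypothetical PyTorch-style setup: a low-level skill module is first trained to imitate retargeted MoCap clips, then frozen and driven by a high-level task policy that emits latent skill commands. Names such as SkillModule, TaskPolicy, and the layer sizes are illustrative assumptions, not the paper's actual architecture.

# Hedged sketch: two-stage skill reuse (module names and sizes are illustrative).
import torch
import torch.nn as nn

class SkillModule(nn.Module):
    # Low-level controller: (proprioception, latent skill command) -> joint targets.
    def __init__(self, obs_dim=48, latent_dim=16, act_dim=12):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim + latent_dim, 256), nn.ELU(),
                                 nn.Linear(256, act_dim))
    def forward(self, proprio, latent):
        return self.net(torch.cat([proprio, latent], dim=-1))

class TaskPolicy(nn.Module):
    # High-level policy: task observation -> latent command for the frozen skill module.
    def __init__(self, task_obs_dim=32, latent_dim=16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(task_obs_dim, 256), nn.ELU(),
                                 nn.Linear(256, latent_dim))
    def forward(self, task_obs):
        return self.net(task_obs)

# Stage 1: train SkillModule (with a reference-motion encoder, omitted here) to track MoCap clips.
# Stage 2: freeze SkillModule; train TaskPolicy with task rewards only, since the
# MoCap prior already keeps the resulting motion natural-looking.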

Cited by 8 publications (19 citation statements)
References 50 publications
“…However, without a tedious reward-engineering process dedicated to the task at hand, it generally struggles to discover the right behaviors. This can be addressed by leveraging expert demonstrations: Multimodal trajectories can be encoded into motion priors that are then used to guide the training of a robust RL agent (39,81,82) in complex loco-manipulation settings. However, instead of relying on expert trajectories generated by human examples (83,84), one could potentially use our framework as an automatic provider of physically consistent demonstrations for an RL pipeline.…”
Section: Discussion
confidence: 99%
“…One common way of overcoming this issue has been by bootstrapping RL with a form of imitation learning. For example, training can be guided by recorded human demonstrations (28,29), animal motion capture clips (38,39), or an RL-trained teacher with access to privileged information (27,40,41). Nevertheless, generating expert demonstrations for every newly encountered task is time consuming, and motion retargeting is often challenging, especially when the robot's morphology differs from that of the demonstrator.…”
Section: Introduction
confidence: 99%
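One of the options mentioned in the quote above, an RL-trained teacher with access to privileged information, is often distilled into a deployable student policy by supervised imitation. A minimal sketch under that assumption follows; the dimensions and the random placeholder observations are purely illustrative.

# Hedged sketch of privileged-teacher distillation (DAgger-style supervision);
# the teacher stands in for an RL-trained policy on privileged simulator state.
import torch
import torch.nn as nn

teacher = nn.Linear(64, 12)   # placeholder for the privileged-state teacher policy
student = nn.Linear(48, 12)   # sees only onboard (deployable) observations
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

for step in range(1000):
    privileged_obs = torch.randn(256, 64)  # placeholder batch of simulator states
    onboard_obs = privileged_obs[:, :48]   # placeholder for the student's partial view
    with torch.no_grad():
        target_action = teacher(privileged_obs)
    loss = nn.functional.mse_loss(student(onboard_obs), target_action)
    opt.zero_grad(); loss.backward(); opt.step()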
“…[13] revisited the necessity of dynamics randomization in legged locomotion and gave suggestions on where and how to use dynamics randomization. [14] trained a low-level quadruped robot locomotion controller by imitating real animal data and used the low-level controller to accomplish different tasks. [15] trained policies to jump from pixel inputs.…”
Section: Deep Reinforcement Learning For Legged Locomotion
confidence: 99%
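The dynamics randomization referenced in [13] is typically applied by resampling physical parameters at each episode reset. A minimal sketch follows, assuming a hypothetical simulator interface (set_friction, set_base_mass_offset, and the other methods are not a real API) and illustrative parameter ranges.

# Hedged sketch: resample dynamics parameters at episode reset.
# The sim methods and ranges below are hypothetical placeholders.
import random

def randomize_dynamics(sim):
    sim.set_friction(random.uniform(0.4, 1.2))              # ground friction coefficient
    sim.set_base_mass_offset(random.uniform(-0.5, 0.5))     # added/removed base mass, kg
    sim.set_motor_strength_scale(random.uniform(0.8, 1.2))  # actuator strength multiplier
    sim.set_action_latency(random.uniform(0.0, 0.02))       # control latency, seconds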
“…A notable line of research has focused on learning a specific task while imitating expert behavior. The expert provides a direct demonstration for solving the task (22,23) or is used to impose a style while discovering the task (24)(25)(26). These approaches require collecting expert data, commonly done offline, through either retargeted motion capture data (24)(25)(26) or a TO technique (22,23).…”
Section: Introduction
confidence: 99%
“…The expert provides a direct demonstration for solving the task (22,23) or is used to impose a style while discovering the task (24)(25)(26). These approaches require collecting expert data, commonly done offline, through either retargeted motion capture data (24)(25)(26) or a TO technique (22,23). The reward function can then be formulated to be dense, meaning that agents can collect nontrivial rewards even if they do not initially solve the task.…”
Section: Introduction
confidence: 99%
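The densification described in the quoted passage, where imitation of expert data supplies reward signal even before the task itself is solved, can be written as a weighted sum of a sparse task term and a dense style/tracking term. The exponential tracking form and the weight below are illustrative choices, not the cited papers' exact formulations.

# Hedged sketch: sparse task reward densified by an imitation/style tracking term.
import numpy as np

def total_reward(task_success, joint_pos, ref_joint_pos, w_style=0.5):
    task_r = 1.0 if task_success else 0.0                              # sparse task term
    style_r = np.exp(-2.0 * np.sum((joint_pos - ref_joint_pos) ** 2))  # dense tracking term
    return task_r + w_style * style_r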