Reinforcement Learning Meets Hybrid Zero Dynamics: A Case Study for RABBIT

Castillo, Guillermo A.; Weng, Bowen; Hereid, Ayonga; Wang, Zheng; Zhang, Wei

doi:10.1109/icra.2019.8793627

Cited by 20 publications

(19 citation statements)

References 31 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…In this section, we build upon our previous work proposed in [11], [13] to implement a cascade-structure learning framework that realizes stable and robust walking gaits for the 3D bipedal robots. The specific design ensures successful transferring learned policies in simulation to robot hardware with minimal turning.…”

Section: Learning Approachmentioning

confidence: 99%

“…Foot placement regulation controller has been widely used in 3D bipedal walking robots with the objective of improving the speed tracking and the stability and robustness of the walking gait [20]- [22]. Longitudinal speed regulation, defined by (13), sets a target offset in the swing hip pitch joint, whereas lateral speed regulation (11) do the same for the swing hip roll angle. Direction regulation (12) add an offset to the yaw hip angle to keep the torso yaw orientation at the desired angle.…”

Section: B Feedback Regulationsmentioning

confidence: 99%

“…[8], [9] learned the Poincaré map of the periodic walking pattern and applied the method to two 2D bipedal robots. Some recent work has proposed to learn the joint-level trajectory for each joint as the reference motion through supervised learning [10] or using reinforcement learning [11]- [13]. This approach simplifies the design of the lower-level tracking, which can be as simple as a PD controller.…”

Section: Introductionmentioning

confidence: 99%

“…Motivated by the existing challenges, this paper enhances the hybrid zero dynamics inspired reinforcement learning framework presented in our previous work [11], [13] emphasizing the policy robustness and sim-to-real transfer. The proposed cascade control structure couples the high-level learning-based controller with the low-level model-based controller.…”

Section: Introductionmentioning

confidence: 99%

See 3 more Smart Citations

Robust Feedback Motion Policy Design Using Reinforcement Learning on a 3D Digit Bipedal Robot

Castillo¹,

Weng²,

Zhang³

et al. 2021

Preprint

Self Cite

View full text Add to dashboard Cite

In this paper, a hierarchical and robust framework for learning bipedal locomotion is presented and successfully implemented on the 3D biped robot Digit built by Agility Robotics. We propose a cascade-structure controller that combines the learning process with intuitive feedback regulations. This design allows the framework to realize robust and stable walking with a reduced-dimension state and action spaces of the policy, significantly simplifying the design and reducing the sampling efficiency of the learning method. The inclusion of feedback regulation into the framework improves the robustness of the learned walking gait and ensures the success of the sim-to-real transfer of the proposed controller with minimal tuning. We specifically present a learning pipeline that considers hardware-feasible initial poses of the robot within the learning process to ensure the initial state of the learning is replicated as close as possible to the initial state of the robot in hardware experiments. Finally, we demonstrate the feasibility of our method by successfully transferring the learned policy in simulation to the Digit robot hardware, realizing sustained walking gaits under external force disturbances and challenging terrains not included during the training process. To the best of our knowledge, this is the first time a learning-based policy is transferred successfully to the Digit robot in hardware experiments without using dynamic randomization or curriculum learning.

show abstract

Section: Learning Approachmentioning

confidence: 99%

Section: B Feedback Regulationsmentioning

confidence: 99%

Section: Introductionmentioning

confidence: 99%

Section: Introductionmentioning

confidence: 99%

See 2 more Smart Citations

Robust Feedback Motion Policy Design Using Reinforcement Learning on a 3D Digit Bipedal Robot

Castillo¹,

Weng²,

Zhang³

et al. 2021

Preprint

Self Cite

View full text Add to dashboard Cite

show abstract

“…To circumvent this engineering empiricism, the field of machine learning has approached bipedal locomotion from many perspectives, including reinforcement leaning and imitation learning. Reinforcement learning simplifies the process of "learning to walk" [13] without prior knowledge [14]- [17], but because this method relies on a carefully crafted reward function, the behavior is exclusively determined by its construction. This motivates the second method, imitation learning, which infers the underlying reward function from expert demonstrations [18]- [20].…”

Section: Introductionmentioning

confidence: 99%

Preference-Based Learning for User-Guided HZD Gait Generation on Bipedal Walking Robots

Tucker

Csomay-Shanklin

et al. 2021

2021 IEEE International Conference on Robotics and Automation (ICRA)

View full text Add to dashboard Cite

This paper presents a framework that unifies control theory and machine learning in the setting of bipedal locomotion. Traditionally, gaits are generated through trajectory optimization methods and then realized experimentallya process that often requires extensive tuning due to differences between the models and hardware. In this work, the process of gait realization via hybrid zero dynamics (HZD) based optimization problems is formally combined with preferencebased learning to systematically realize dynamically stable walking. Importantly, this learning approach does not require a carefully constructed reward function, but instead utilizes human pairwise preferences. The power of the proposed approach is demonstrated through two experiments on a planar biped AMBER-3M: the first with rigid point feet, and the second with induced model uncertainty through the addition of springs where the added compliance was not accounted for in the gait generation or in the controller. In both experiments, the framework achieves stable, robust, efficient, and natural walking in fewer than 50 iterations with no reliance on a simulation environment. These results demonstrate a promising step in the unification of control theory and learning.

show abstract

Reinforcement Learning-Based Cascade Motion Policy Design for Robust 3D Bipedal Locomotion

et al. 2022

View full text Add to dashboard Cite

This paper presents a novel reinforcement learning (RL) framework to design cascade feedback control policies for 3D bipedal locomotion. Existing RL algorithms are often trained in an endto-end manner or rely on prior knowledge of some reference joint or task space trajectories. Unlike these studies, we propose a policy structure that decouples the bipedal locomotion problem into two modules that incorporate the physical insights from the nature of the walking dynamics and the well-established Hybrid Zero Dynamics approach for 3D bipedal walking. As a result, the overall RL framework has several key advantages, including lightweight network structure, sample efficiency, and less dependence on prior knowledge. The proposed solution learns stable and robust walking gaits from scratch and allows the controller to realize omnidirectional walking with accurate tracking of the desired velocity and heading angle. The learned policies also perform robustly against various adversarial forces applied to the torso and walking blindly on a series of challenging and unstructured terrains. These results demonstrate that the proposed cascade feedback control policy is suitable for navigation of 3D bipedal robots in indoor and outdoor environments.

show abstract

Reinforcement Learning Meets Hybrid Zero Dynamics: A Case Study for RABBIT

Cited by 20 publications

References 31 publications

Robust Feedback Motion Policy Design Using Reinforcement Learning on a 3D Digit Bipedal Robot

Robust Feedback Motion Policy Design Using Reinforcement Learning on a 3D Digit Bipedal Robot

Preference-Based Learning for User-Guided HZD Gait Generation on Bipedal Walking Robots

Reinforcement Learning-Based Cascade Motion Policy Design for Robust 3D Bipedal Locomotion

Contact Info

Product

Resources

About