2021 IEEE International Conference on Robotics and Automation (ICRA) 2021
DOI: 10.1109/icra48506.2021.9561515

Preference-Based Learning for User-Guided HZD Gait Generation on Bipedal Walking Robots

Abstract: This paper presents a framework that unifies control theory and machine learning in the setting of bipedal locomotion. Traditionally, gaits are generated through trajectory optimization methods and then realized experimentally, a process that often requires extensive tuning due to differences between the models and hardware. In this work, the process of gait realization via hybrid zero dynamics (HZD) based optimization problems is formally combined with preference-based learning to systematically realize dynamica…

Cited by 13 publications (8 citation statements)
References 37 publications (44 reference statements)
“…As such, both pilots requested that the wheel position tracking gains of the stabilization controller be reduced significantly. These findings suggest that our stabilization controller with a human in the loop may benefit from preference based control designs [21].…”
Section: Human-robot Interaction (mentioning)
confidence: 86%
“…The input layer maps the bending sensor signals into the Reservoir Computing space, while the reservoir layer is composed of a randomly connected network of neurons that transforms the input into a high-dimensional feature space. 14,15 The Reservoir Computing layer then feeds the transformed input into a linear readout layer to estimate the system's response, the benefit for Reservoir Computing is that it able to analysis chaotic system which suitable for soft robotics dynamic modeling. 13,16 The only trainable part of reservoir computing is the readout, which is ridge node, can be trained by state of training and y[t], where the state of training is the activation of the reservoir trigger by x[t], 10 as shown in Figure 1.…”
Section: Reservoir Computing (mentioning)
confidence: 99%
“…We refer to this setting as "preference-based learning". In this work, we utilize a more recent preference-based learning algorithm, LineCoSpar [23] with the addition of ordinal labels inspired from [24], which maintains the posterior only over a subset of the entire actions space to increase computation tractability -more details can be found in [14]. The resulting learning framework iteratively applies Thompson sampling to navigate a high-dimensional Bayesian landscape of user preferences.…”
Section: Learning Framework (mentioning)
confidence: 99%