2018 IEEE International Conference on Robotics and Automation (ICRA) 2018
DOI: 10.1109/icra.2018.8460984

Shaping in Practice: Training Wheels to Learn Fast Hopping Directly in Hardware

Abstract: Learning robot controllers instead of designing them can greatly reduce the engineering effort required, while also emphasizing robustness. Despite considerable progress in simulation, applying learning directly in hardware is still challenging, in part because of the need to explore potentially unstable parameters. We explore the concept of shaping the reward landscape with training wheels: temporary modifications of the physical hardware that facilitate learning. We demonstrate the concept with a robot leg mounted o…

Citations: cited by 8 publications (7 citation statements)
References: 18 publications (18 reference statements)
“…The knee joint stiffness is realized by a spring that wraps around the knee joint on a cam mechanism, inspired by a knee cap (patella) (Allen et al, 2017; Heim et al, 2018). The knee cam mechanism linearizes the knee spring deflection over knee angle.…”
Section: Methods | Citation type: mentioning
confidence: 99%
“…Randlov proved that for a finite Markovian decision process with a limited reward signal, it is guaranteed that if a series of tasks converges to the original one, then the optimal value function converges to the original one as well [25]. In [26], a temporary device to reduce gravity helped the learning of single-leg hopping, showcasing the potential of shaping in real-world applications. The same concept is also known by the term curriculum learning in RL [27].…”
Section: Introduction | Citation type: mentioning
confidence: 99%
“…Despite the good results in the simulated environments and the large application of the actor-critic approach, some authors emphasize the challenge of implementing RL algorithms directly on the real plant [24]. The full control of the vehicle commands for example can generate unsafe behaviors as accelerating the vehicle against obstacles, or driving the vehicle to its dynamic limits, causing accidents.…”
Section: Introduction | Citation type: mentioning
confidence: 99%