2010 10th IEEE-RAS International Conference on Humanoid Robots
DOI: 10.1109/ichr.2010.5686320

Reinforcement learning of full-body humanoid motor skills

Abstract: Applying reinforcement learning to humanoid robots is challenging because humanoids have a large number of degrees of freedom and their state and action spaces are continuous. Thus, most reinforcement learning algorithms would become computationally infeasible and require a prohibitive number of trials to explore such high-dimensional spaces. In this paper, we present a probabilistic reinforcement learning approach, which is derived from the framework of stochastic optimal control and path integrals. The algorithm, …
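The abstract points to a path-integral-style policy update. As a rough illustration of that family of methods (not the paper's exact algorithm), the sketch below perturbs a parameter vector, scores each noisy rollout, and recombines the perturbations with exponentiated-cost weights; the function names, Gaussian noise model, and temperature `lam` are all assumptions.

```python
import numpy as np

# Rough sketch of one path-integral policy-improvement step, in the spirit
# of the approach the abstract describes. Illustration only: `rollout_cost`,
# the exploration noise, and the temperature are assumed choices.

def path_integral_update(theta, rollout_cost, n_rollouts=20,
                         noise_std=0.1, lam=1.0):
    """Perturb the policy parameters, score each noisy rollout, and
    recombine the perturbations weighted by exponentiated negative cost."""
    eps = noise_std * np.random.randn(n_rollouts, theta.size)
    costs = np.array([rollout_cost(theta + e) for e in eps])
    # Low-cost rollouts receive exponentially larger weights (softmax).
    weights = np.exp(-(costs - costs.min()) / lam)
    weights /= weights.sum()
    return theta + weights @ eps  # probability-weighted parameter update
```

Iterating `theta = path_integral_update(theta, cost)` on a toy quadratic cost such as `cost = lambda th: float(th @ th)` steadily drives `theta` toward zero without ever computing a gradient, which is the practical appeal of this family of methods for high-dimensional robots.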

Cited by 46 publications (42 citation statements). References 11 publications.

“…Through three emblematic scenarios, we showed how variable task weights resolve a broad set of issues encountered in multi-task execution with minimal tuning and in a reactive manner. In addition to the variance to weights mapping, we developed a method of computing variance for a single trajectory demonstration using a covariance function (8). This tool is essential in cases where only one trajectory has been provided for the task, as in trajectory generation.…”
Section: Discussion (mentioning)
confidence: 99%
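The statement above derives a variance profile from a single demonstrated trajectory via a covariance function. Equation (8) of the citing paper is not reproduced here, so the sketch below shows only one plausible reading under assumed choices: treat the lone demonstration as noisy observations of a Gaussian process and take the posterior predictive variance as the per-time-step variance. The squared-exponential kernel and its hyperparameters are assumptions.

```python
import numpy as np

# Hypothetical reading of "variance from a single demonstration via a
# covariance function": fit a GP to the lone demo and use its posterior
# variance. Kernel form and hyperparameters below are assumed.

def rbf(a, b, sigma=1.0, ell=0.1):
    d = a[:, None] - b[None, :]
    return sigma**2 * np.exp(-0.5 * (d / ell) ** 2)

def demo_variance(t_demo, t_query, noise=1e-3):
    """Posterior predictive variance at t_query, given one demonstration
    observed at times t_demo (1-D arrays of normalized time stamps)."""
    K = rbf(t_demo, t_demo) + noise * np.eye(t_demo.size)
    Ks = rbf(t_query, t_demo)
    Kss = rbf(t_query, t_query)
    return np.diag(Kss - Ks @ np.linalg.solve(K, Ks.T))
```

Under this reading, the variance stays near the noise floor at the demonstrated time stamps and grows away from them, i.e., the single demonstration constrains the task tightly only where it was actually observed.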
“…, can be obtained through multiple demonstrations as in [5], [7], [8], [9], or computed from scratch. The concatenation of these position means and variances respectively yields M_Υ and V_Υ for the given trajectory, Υ.…”
Section: B. Task Formalism (mentioning)
confidence: 99%
“…[3], [4]). Among different approaches in this area, reinforcement learning algorithms have shown good performance both in simulation and in real-world applications.…”
Section: B. Model-free Algorithms (mentioning)
confidence: 99%
“…The goal is to control the robot from an initial state to a final state within 6 seconds. At the final time, the center of the ball should be at [3, 1]^T meters from its starting position, and the robot should also have zero velocity, zero tilt angles and tilt-angle rates, and zero change in heading. The designed cost function has the same form as in (3), where the intermediate cost l(x, u) and final cost h(x) are defined as follows:…”
Section: Cost Function (mentioning)
confidence: 99%
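The citing paper's equation (3) is not reproduced in the statement, so the sketch below only illustrates the described structure under assumed choices: a quadratic control penalty along the trajectory, plus a terminal penalty on deviation from the stated goal (ball displaced [3, 1]^T m, everything else zero at T = 6 s). The state ordering and all weight matrices are assumptions.

```python
import numpy as np

# Assumed 9-D state: [ball x, ball y, ball vx, ball vy,
#                     tilt angle x, tilt angle y,
#                     tilt rate x, tilt rate y, heading change].
# Goal: ball at [3, 1]^T m, all other components zero at the final time.
x_goal = np.array([3.0, 1.0, 0, 0, 0, 0, 0, 0, 0])

R  = 1e-2 * np.eye(2)                                 # control weight (assumed)
Qf = np.diag([50.0, 50.0, 10, 10, 20, 20, 5, 5, 5])   # terminal weights (assumed)

def l(x, u):
    """Intermediate cost: penalize control effort along the trajectory."""
    return 0.5 * u @ R @ u

def h(x):
    """Final cost: quadratic penalty on deviation from the goal state."""
    e = x - x_goal
    return 0.5 * e @ Qf @ e
```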