2010 10th IEEE-RAS International Conference on Humanoid Robots
DOI: 10.1109/ichr.2010.5686320

Reinforcement learning of full-body humanoid motor skills

Abstract: Applying reinforcement learning to humanoid robots is challenging because humanoids have a large number of degrees of freedom and their state and action spaces are continuous. Thus, most reinforcement learning algorithms would become computationally infeasible and require a prohibitive number of trials to explore such high-dimensional spaces. In this paper, we present a probabilistic reinforcement learning approach, which is derived from the framework of stochastic optimal control and path integrals. The algorithm, …
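The abstract points to a path-integral-style policy update. As a rough illustration of that family of methods (not the paper's exact algorithm), the sketch below perturbs a parameter vector, scores each noisy rollout, and recombines the perturbations with exponentiated-cost weights; the function names, Gaussian noise model, and temperature `lam` are all assumptions.

```python
import numpy as np

# Rough sketch of one path-integral policy-improvement step, in the spirit
# of the approach the abstract describes. Illustration only: `rollout_cost`,
# the exploration noise, and the temperature are assumed choices.

def path_integral_update(theta, rollout_cost, n_rollouts=20,
                         noise_std=0.1, lam=1.0):
    """Perturb the policy parameters, score each noisy rollout, and
    recombine the perturbations weighted by exponentiated negative cost."""
    eps = noise_std * np.random.randn(n_rollouts, theta.size)
    costs = np.array([rollout_cost(theta + e) for e in eps])
    # Low-cost rollouts receive exponentially larger weights (softmax).
    weights = np.exp(-(costs - costs.min()) / lam)
    weights /= weights.sum()
    return theta + weights @ eps  # probability-weighted parameter update
```

Iterating `theta = path_integral_update(theta, cost)` on a toy quadratic cost such as `cost = lambda th: float(th @ th)` steadily drives `theta` toward zero without ever computing a gradient, which is the practical appeal of this family of methods for high-dimensional robots.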

Cited by 46 publications (42 citation statements). References 11 publications.

“…Through three emblematic scenarios, we showed how variable task weights resolve a broad set of issues encountered in multi-task execution with minimal tuning and in a reactive manner. In addition to the variance to weights mapping, we developed a method of computing variance for a single trajectory demonstration using a covariance function (8). This tool is essential in cases where only one trajectory has been provided for the task, as in trajectory generation.…”
Section: Discussion (mentioning)
confidence: 99%
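The statement above derives a variance profile from a single demonstrated trajectory via a covariance function. Equation (8) of the citing paper is not reproduced here, so the sketch below shows only one plausible reading under assumed choices: treat the lone demonstration as noisy observations of a Gaussian process and take the posterior predictive variance as the per-time-step variance. The squared-exponential kernel and its hyperparameters are assumptions.

```python
import numpy as np

# Hypothetical reading of "variance from a single demonstration via a
# covariance function": fit a GP to the lone demo and use its posterior
# variance. Kernel form and hyperparameters below are assumed.

def rbf(a, b, sigma=1.0, ell=0.1):
    d = a[:, None] - b[None, :]
    return sigma**2 * np.exp(-0.5 * (d / ell) ** 2)

def demo_variance(t_demo, t_query, noise=1e-3):
    """Posterior predictive variance at t_query, given one demonstration
    observed at times t_demo (1-D arrays of normalized time stamps)."""
    K = rbf(t_demo, t_demo) + noise * np.eye(t_demo.size)
    Ks = rbf(t_query, t_demo)
    Kss = rbf(t_query, t_query)
    return np.diag(Kss - Ks @ np.linalg.solve(K, Ks.T))
```

Under this reading, the variance stays near the noise floor at the demonstrated time stamps and grows away from them, i.e., the single demonstration constrains the task tightly only where it was actually observed.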
“…, can be obtained through multiple demonstrations as in [5], [7], [8], [9], or computed from scratch. The concatenation of these position means and variances respectively yields M_Υ and V_Υ for the given trajectory, Υ.…”
Section: B. Task Formalism (mentioning)
confidence: 99%
“…[3], [4]). Among different approaches in this area, reinforcement learning algorithms have shown good performance both in simulation and in real-world applications.…”
Section: B. Model-free Algorithms (mentioning)
confidence: 99%
“…The goal is to control the robot from an initial state to a final state within 6 seconds. At the final time, the center of the ball should be at [3, 1]^T meters from its starting position, and the robot should also have zero velocity, zero tilt angles and tilt-angle rates, and zero change in heading. The designed cost function has the same form as in (3), where the intermediate cost l(x, u) and final cost h(x) are defined as follows:…”
Section: Cost Function (mentioning)
confidence: 99%
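The citing paper's equation (3) is not reproduced in the statement, so the sketch below only illustrates the described structure under assumed choices: a quadratic control penalty along the trajectory, plus a terminal penalty on deviation from the stated goal (ball displaced [3, 1]^T m, everything else zero at T = 6 s). The state ordering and all weight matrices are assumptions.

```python
import numpy as np

# Assumed 9-D state: [ball x, ball y, ball vx, ball vy,
#                     tilt angle x, tilt angle y,
#                     tilt rate x, tilt rate y, heading change].
# Goal: ball at [3, 1]^T m, all other components zero at the final time.
x_goal = np.array([3.0, 1.0, 0, 0, 0, 0, 0, 0, 0])

R  = 1e-2 * np.eye(2)                                 # control weight (assumed)
Qf = np.diag([50.0, 50.0, 10, 10, 20, 20, 5, 5, 5])   # terminal weights (assumed)

def l(x, u):
    """Intermediate cost: penalize control effort along the trajectory."""
    return 0.5 * u @ R @ u

def h(x):
    """Final cost: quadratic penalty on deviation from the goal state."""
    e = x - x_goal
    return 0.5 * e @ Qf @ e
```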