2022
DOI: 10.48550/arxiv.2206.11693
Preprint

Learning Agile Skills via Adversarial Imitation of Rough Partial Demonstrations

Abstract: Learning agile skills is one of the main challenges in robotics. To this end, reinforcement learning approaches have achieved impressive results. However, these methods require explicit task information in the form of a reward function, or an expert that can be queried in simulation to provide a target control output, which limits their applicability. In this work, we propose a generative adversarial method for inferring reward functions from partial and potentially physically incompatible demonstrations for successful ski…
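
The abstract describes inferring a reward function adversarially from demonstrations. Below is a minimal sketch of a discriminator-based imitation reward in that spirit; the class and function names, network sizes, and the specific reward formulation are illustrative assumptions, not the paper's implementation.

```python
# Sketch of an adversarial imitation ("style") reward: a discriminator is
# trained to separate demonstration transitions from policy transitions, and
# its output is turned into a reward for the RL agent. All names and the
# reward formula below are assumptions for illustration.
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Scores state transitions (s, s_next); trained to tell demos from rollouts."""
    def __init__(self, obs_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, s: torch.Tensor, s_next: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([s, s_next], dim=-1))

def style_reward(disc: Discriminator, s: torch.Tensor, s_next: torch.Tensor) -> torch.Tensor:
    """Reward is high when a transition looks like the demonstrations.
    Uses the common GAIL-style form r = -log(1 - sigmoid(D(s, s_next)))."""
    with torch.no_grad():
        logits = disc(s, s_next)
        return -torch.log(torch.clamp(1.0 - torch.sigmoid(logits), min=1e-4))
```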

Cited by 1 publication (2 citation statements)
References 29 publications
“…Inverse reinforcement learning techniques such as adversarial motion priors can also be used to learn a reward function that encourages the policy to produce motions similar to a prescribed motion dataset, e.g., [18], [19]. There are various ways to obtain a reference motion, e.g., trajectory optimization [20], [17], motion capture data from animals [15], [18], or even crude hand-designed motions [16], [21], [22]. In this paper, we use a tracking-based reward to generate highly dynamic behaviors.…”
Section: Imitation-based Reinforcement Learning for Legged Robots
confidence: 99%
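
The citing work above states that it uses a tracking-based reward rather than an adversarial one. A hedged sketch of such a reward, encouraging the policy to stay close to a reference trajectory, could look as follows; the error terms, weights, and function signature are assumptions for illustration, not taken from the cited paper.

```python
# Sketch of a tracking-based imitation reward: exponentiated negative errors
# between the robot state and a reference at the current phase. Weights and
# the choice of tracked quantities are illustrative assumptions.
import numpy as np

def tracking_reward(q, q_ref, base_pos, base_pos_ref,
                    w_joint: float = 5.0, w_base: float = 10.0) -> float:
    """Higher reward the closer joints and base are to the reference motion."""
    joint_err = np.sum((np.asarray(q) - np.asarray(q_ref)) ** 2)
    base_err = np.sum((np.asarray(base_pos) - np.asarray(base_pos_ref)) ** 2)
    return float(np.exp(-w_joint * joint_err) + np.exp(-w_base * base_err))
```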
“…Therefore, the observation for the RL agent consists of the joint angles and angular velocities, as well as the base orientation represented as a quaternion. Note that past work demonstrating dynamic behaviours on the Solo 8 and Solo 12 robots [22], [30] relied on external motion capture systems to provide high-fidelity base pose estimates, which we do not use here. As per [26], we additionally augment the observation with the phase, represented as o_phase = (cos(2πφ/T), sin(2πφ/T)).…”
Section: Imitation-based Reinforcement Learning
confidence: 99%
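
The observation described in this statement can be sketched in a few lines. The helper name and array layout below are assumptions; only the listed components (joint angles, joint velocities, base quaternion, and the cos/sin phase features) come from the quote.

```python
# Sketch of the observation vector described above: proprioceptive state plus
# a cyclic phase encoding o_phase = (cos(2*pi*phi/T), sin(2*pi*phi/T)).
import numpy as np

def build_observation(q, dq, base_quat, phase: float, period: float) -> np.ndarray:
    """Concatenate joint angles, joint velocities, base quaternion, and phase features."""
    o_phase = np.array([np.cos(2 * np.pi * phase / period),
                        np.sin(2 * np.pi * phase / period)])
    return np.concatenate([np.asarray(q), np.asarray(dq),
                           np.asarray(base_quat), o_phase])
```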