2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
DOI: 10.1109/iros47612.2022.9981973
Adversarial Motion Priors Make Good Substitutes for Complex Reward Functions

Cited by 41 publications (24 citation statements)
References 39 publications
“…Our method is based on the imitation-RL method AMP (Peng et al., 2021), which is capable of learning complex naturalistic motion on humanoid skeletons and shows good transferability of simulation-learned policies to real-world robots (Escontrela et al., 2022; Vollenweider et al., 2022). AMP is a successor to Generative Adversarial Imitation Learning (Ho and Ermon, 2016): it takes one or more clips of reference motion and learns a motor control policy π_θ that imitates the motion dynamics of the reference(s) through a discriminator network D_φ.…”
Section: Methods
confidence: 99%
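The quoted passage describes the core AMP mechanism: a discriminator D_φ scores state transitions as reference-like or policy-like, and its output is mapped to a style reward for the policy π_θ. Below is a minimal PyTorch sketch of that idea, using the least-squares reward mapping r = max(0, 1 − 0.25(d − 1)²) from Peng et al. (2021); the class and function names are illustrative and not taken from any of the cited codebases.

# Minimal sketch of an AMP-style discriminator and style reward.
# Names (AMPDiscriminator, style_reward) are illustrative assumptions.
import torch
import torch.nn as nn

class AMPDiscriminator(nn.Module):
    """D_phi: scores state transitions (s, s') as reference-like vs. policy-like."""
    def __init__(self, transition_dim, hidden_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(transition_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, transition):
        return self.net(transition)

def style_reward(disc, s, s_next):
    """Least-squares style reward from Peng et al. (2021): max(0, 1 - 0.25*(d - 1)^2)."""
    with torch.no_grad():
        d = disc(torch.cat([s, s_next], dim=-1))
        return torch.clamp(1.0 - 0.25 * (d - 1.0) ** 2, min=0.0).squeeze(-1)

During training, the discriminator is updated to output 1 on reference transitions and -1 on policy transitions, so the reward above is highest when the policy's motion is indistinguishable from the reference clips.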
“…For convenience, it will be referred to as 'complex rewards' in the following text. 2) Policy trained with the method from Escontrela et al. [5], using adversarial motion priors as a style reward. For convenience, it will be referred to as 'amp' in the following text.…”
Section: A. Quantitative Analysis
confidence: 99%
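The comparison above contrasts a hand-tuned 'complex rewards' baseline with the 'amp' policy, where the learned style reward replaces most shaping terms. The sketch below illustrates that split under stated assumptions: the velocity-tracking task reward, its exponential kernel, and the 0.5/0.5 weights are illustrative choices, not values reported in [5].

# Hedged sketch of the 'amp' reward structure: a simple task term plus
# the learned style reward, instead of many hand-designed shaping terms.
import numpy as np

def task_reward(base_velocity, command_velocity, scale=1.0):
    # Velocity-tracking objective; the exponential kernel is a common
    # choice in legged-locomotion RL, assumed here for illustration.
    err = np.sum((command_velocity - base_velocity) ** 2)
    return float(np.exp(-scale * err))

def total_reward(r_task, r_style, w_task=0.5, w_style=0.5):
    # r_style comes from the AMP discriminator and stands in for
    # hand-tuned smoothness/gait terms; only the task objective remains.
    return w_task * r_task + w_style * r_style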
“…[11] trained a neural-network state estimator to estimate robot states that cannot be read directly from sensory data. [5] used AMP to train control policies for a quadrupedal robot and showed that adversarial motion priors make good substitutes for complex reward functions. [12] trained a reinforcement-learning controller using unsupervised skill discovery and transferred it to a real quadruped robot.…”
Section: Deep Reinforcement Learning for Legged Locomotion
confidence: 99%