2022
DOI: 10.1016/j.robot.2022.104264
Goal-aware generative adversarial imitation learning from imperfect demonstration for robotic cloth manipulation

Cited by 8 publications (6 citation statements)
References 26 publications
“…In order to robustly learn desirable policies even from subpar demonstrations, Tsurumine et al [ 122 ] propose goal-aware generative adversarial imitation learning (GA-GAIL), which uses a second discriminator to distinguish the goal state in parallel with the first discriminator that indicates the demonstration data. To accomplish stable policy learning from two discriminators, GA-GAIL uses the entropy-maximizing deep P-network (EDPN) as a generator.…”
Section: Deep RL for Robotic Manipulation
confidence: 99%
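The two-discriminator idea in the statement above can be sketched as a reward that blends a demonstration discriminator with a goal discriminator. This is a minimal illustration only: the combination rule, the weight `w`, and the clipped `-log(1 - D)` reward form are assumptions here, not details taken from the GA-GAIL paper.

```python
import numpy as np

def gail_reward(d_demo, d_goal, w=0.5):
    """Blend two discriminator outputs into one imitation reward.

    d_demo: output of the discriminator trained on demonstration data.
    d_goal: output of the discriminator trained to recognize goal states.
    w:      assumed mixing weight between the two reward terms.
    """
    # Standard GAIL-style reward, -log(1 - D), clipped for numerical stability.
    r_demo = -np.log(np.clip(1.0 - d_demo, 1e-8, 1.0))
    r_goal = -np.log(np.clip(1.0 - d_goal, 1e-8, 1.0))
    return (1.0 - w) * r_demo + w * r_goal
```

States that both discriminators rate highly receive a larger reward, which is how a goal discriminator can steer the policy toward the goal even when some demonstrations are subpar.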
“…However, the number of bins needed to approximate a continuous action space grows exponentially with increasing dimensionality. Generative adversarial networks [27,76,16], variational autoencoders [47] and combined Categorical and Gaussian distributions [66,23,14] have been used to learn from multimodal demonstrations. Nevertheless, these models tend to be sensitive to hyperparameters, such as the number of clusters used [66].…”
Section: Learning Robot Manipulation Policies from Demonstrations
confidence: 99%
“…Indeed, human demonstrations often contain diverse ways that a task can be accomplished. A natural choice is then to treat policy learning as a distribution learning problem: instead of representing a policy as a deterministic map π_θ(x), learn the entire distribution of actions conditioned on the current robot state p(y|x) [27,76,25,66]. Recent works use diffusion models for learning such state-conditioned action distributions for robot manipulation policies from demonstrations [54,11,60] and show they outperform deterministic or other alternatives, such as variational autoencoders [47] or mixtures of Gaussians.…”
Section: Introduction
confidence: 99%
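One simple instance of the state-conditioned action distribution p(y|x) mentioned above is a Gaussian mixture policy. The sketch below is illustrative: `weights`, `means`, and `stds` stand in for the outputs of a network head evaluated at the current state, and none of these names come from the cited works.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_gmm_policy(weights, means, stds):
    """Sample one action from a Gaussian mixture over actions.

    weights, means, stds: per-mode mixture parameters, assumed to be
    produced by a policy network conditioned on the current state x.
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()          # normalize mixture weights
    k = rng.choice(len(weights), p=weights)    # pick a mode
    return rng.normal(means[k], stds[k])       # sample within that mode
```

Because the policy samples a mode first, it can represent several distinct ways of performing the same task, which a single deterministic map π_θ(x) cannot.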
“…To date, network-based imitation learning methods involve sampling human-operated skills and then training neural networks on the acquired sample data to achieve skill learning. These methods primarily include behavioral cloning (BC) methods (Li et al, 2022), in which neural networks are trained directly on the demonstrations, and generative adversarial imitation learning (GAIL) methods (Tsurumine and Matsubara, 2022), which approximate policies through generative adversarial techniques. Neural-network-based imitation learning methods excel at policy approximation.…”
confidence: 99%
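The "directly trained" BC approach contrasted with GAIL above can be shown as a single supervised update. This toy sketch assumes a linear policy and a squared-error loss purely for illustration; real BC methods use deep networks.

```python
import numpy as np

def bc_step(theta, states, actions, lr=0.1):
    """One behavioral-cloning gradient step for a linear policy a = states @ theta.

    theta:   current policy parameters.
    states:  batch of demonstration states, shape (n, d).
    actions: demonstrated actions for those states, shape (n,).
    """
    pred = states @ theta
    grad = states.T @ (pred - actions) / len(states)  # gradient of mean squared error
    return theta - lr * grad
```

Unlike GAIL, no discriminator is involved: the policy simply regresses onto the demonstrated actions, which is why BC is sensitive to imperfect demonstrations.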