2019
DOI: 10.1016/j.robot.2018.11.004

Deep reinforcement learning with smooth policy update: Application to robotic cloth manipulation

Cited by 149 publications (71 citation statements)
References 17 publications
“…Their method requires prior information about the shape of the object in order to reconstruct the missing marker points. More recent work in using (deep) reinforcement learning for robotic folding also used vision-based methods to define the reward function for the agent [9,12]. However, relying solely on visual inputs and marker clues does not scale well.…”
Section: Related Work
confidence: 99%
“…Hand-engineered heuristics to estimate the cloth state have been researched [8] but are highly complex to reproduce and are prone to error. Visual cues such as colored surfaces or fiducial markers [11,12] can be attached to highly deformable materials but require the many occlusions caused by the self-collision of the cloth to be handled.…”
Section: Fitted Q-learning
confidence: 99%
“…Emphasis was placed on the learning of the robotic arms, and deep reinforcement learning was investigated to learn complex policies through high-level observations such as typing. Because deep reinforcement learning requires a large number of training samples, a method that improves the sample efficiency and learning stability with fewer samples by combining the characteristics of smooth policy updates with automatic feature extraction of deep neural networks was proposed [21]. Similarly, there has been a study involving learning robots that pick up and classify objects.…”
Section: RL X System
confidence: 99%
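The smooth policy update referenced above mixes the previous policy into each new one rather than jumping straight to the greedy policy, which is what stabilizes learning with few samples. A minimal sketch of one such KL-regularized (mirror-descent / dynamic-policy-programming-style) update for a single state is below; the function name, the temperature `eta`, and the toy values are illustrative assumptions, not the paper's exact algorithm:

```python
import numpy as np

def smooth_policy_update(prev_policy, q_values, eta=1.0):
    """One smoothed policy update: reweight the previous policy by
    exp(Q / eta) and renormalize, instead of switching to the greedy
    policy outright. Larger eta means a smaller policy change.

    prev_policy: (n_actions,) probabilities for one state
    q_values:    (n_actions,) action values for that state
    """
    logits = np.log(prev_policy + 1e-12) + np.asarray(q_values, dtype=float) / eta
    logits -= logits.max()  # subtract max for numerical stability
    new_policy = np.exp(logits)
    return new_policy / new_policy.sum()

# A uniform two-action policy is nudged toward the higher-valued
# action but does not collapse onto it:
pi0 = np.array([0.5, 0.5])
q = np.array([1.0, 0.0])
pi1 = smooth_policy_update(pi0, q, eta=2.0)
```

With `eta=2.0` the updated policy prefers the first action while keeping nonzero mass on the second, illustrating the "smooth" part of the update.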
“…It is experimentally shown that the dueling network architecture converges faster than the conventional single-stream network architecture (Wang et al., 2016; Tsurumine et al., 2019). Another important technological advance is entropy-based regularization (Ziebart et al., 2008; Mnih et al., 2016), which has been shown to improve both exploration and robustness by adding the entropy of the policy to the reward function.…”
Section: Techniques Together With DQN
confidence: 99%
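The two techniques named in that statement are easy to show in isolation. A dueling network splits Q into a state value and per-action advantages, recombined as Q(s,a) = V(s) + A(s,a) - mean A(s,·); entropy regularization adds a bonus proportional to the policy's entropy. The sketch below, in plain NumPy, shows only these two aggregation formulas (the function names and values are illustrative, not from the cited papers):

```python
import numpy as np

def dueling_aggregate(value, advantages):
    """Combine the two streams of a dueling network:
    Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a').
    Subtracting the mean advantage makes V and A identifiable."""
    advantages = np.asarray(advantages, dtype=float)
    return value + advantages - advantages.mean()

def entropy_bonus(policy, beta=0.01):
    """Entropy term added to the reward/objective:
    beta * H(pi) = -beta * sum_a pi(a) * log pi(a)."""
    policy = np.asarray(policy, dtype=float)
    return -beta * np.sum(policy * np.log(policy + 1e-12))

# V(s) = 1.0 with advantages [2, 0, 1]; the mean advantage (1.0)
# is subtracted before adding V:
q = dueling_aggregate(1.0, [2.0, 0.0, 1.0])

# A uniform 4-action policy has maximal entropy log(4):
h = entropy_bonus([0.25, 0.25, 0.25, 0.25], beta=1.0)
```

The mean-subtraction is the design choice emphasized by the dueling architecture: without it, a constant could be shifted freely between V and A, making the decomposition ambiguous.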