2019
DOI: 10.1016/j.robot.2019.06.007

End-to-end nonprehensile rearrangement with deep reinforcement learning and simulation-to-reality transfer

Cited by 40 publications (24 citation statements)
References 25 publications
“…Notably, the system has to forgo access to raw sensor data to avoid the gap between simulation and reality. The authors addressed this issue in their next work [163]. They trained policies end to end by using the deep Q-learning algorithm with CNN, which maps raw pixels as a state-action value then transfers the policy to a real robotic application with supervised examples.…”
Section: Simulation-to-real-world Transfer
Citation type: mentioning
confidence: 99%
“…Reinforcement learning was not the focus in the early stage, but with Google's successful application in Atari and Go games, this branch of machine learning has attracted much attention. With the development of deep reinforcement learning, researchers have combined it with machine vision [138][139][140][141][142] in the hope of removing the need for labeled data and artificial means to achieve intelligence.…”
Section: Different Machine Vision Algorithms Without Labeled Data
Citation type: mentioning
confidence: 99%
“…The autonomous solutions to the reaching through clutter problem can be categorized into three groups: There are sampling-based planning approaches [5], [6], [9], trajectory optimization based approaches [3], [14], and learning-based approaches [4], [7], [15], [16]. While these approaches show varying degrees of success, the difficult instances of this problem are still challenging for autonomous systems, due to the problem being high-dimensional and under-actuated, and also due to real-world physics uncertainty.…”
Section: Related Work
Citation type: mentioning
confidence: 99%