2018 IEEE International Conference on Simulation, Modeling, and Programming for Autonomous Robots (SIMPAR)
DOI: 10.1109/simpar.2018.8376268

Reinforcement learning for non-prehensile manipulation: Transfer from simulation to physical system

Abstract: Reinforcement learning has emerged as a promising methodology for training robot controllers. However, most results have been limited to simulation due to the need for a large number of samples and the lack of automated-yet-safe data collection methods. Model-based reinforcement learning methods provide an avenue to circumvent these challenges, but the traditional concern has been the mismatch between the simulator and the real world. Here, we show that control policies learned in simulation can successfully t…

Cited by 56 publications (51 citation statements) · References 41 publications
Citation types: 1 supporting, 44 mentioning, 0 contrasting
Citing publications: 2019-2023
“…A combination of system identification and dynamics randomization has been used in the past to learn locomotion for a real quadruped [26], non-prehensile object manipulation [27], and in-hand object pivoting [28]. In our work, we recognize domain randomization and system identification as powerful tools for training general policies in simulation.…”
Section: Related Work (mentioning, confidence: 99%)
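For readers unfamiliar with the technique, dynamics randomization in its simplest form just resamples physical parameters before each training rollout. The Python snippet below is a minimal sketch of that idea, not code from the cited works; the parameter names and ranges are assumptions chosen for illustration.

import numpy as np

# Minimal sketch of dynamics randomization (illustrative only).
# Resampling physical parameters at the start of each episode keeps the
# policy from overfitting to a single simulator instance.
PARAM_RANGES = {
    "object_mass": (0.05, 0.50),    # kg (assumed range)
    "table_friction": (0.3, 1.2),   # sliding friction coefficient
    "motor_gain": (0.8, 1.2),       # actuator gain multiplier
}

def sample_dynamics(rng):
    """Draw one simulator configuration from the randomization ranges."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in PARAM_RANGES.items()}

rng = np.random.default_rng(0)
for episode in range(3):
    params = sample_dynamics(rng)
    # A real pipeline would push `params` into the simulator (e.g. set
    # body masses and friction) and then collect one training rollout.
    print(f"episode {episode}: {params}")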
“…Manual parameter tuning is another form of simulator modification that can be done before applying reinforcement learning. Lowrey et al. (2018) manually identify simulation parameters before applying policy-gradient reinforcement learning to learn to push an object to target positions. Tan et al. (2018) perform similar system identification (including disassembling the robot and measuring each part) and add action-latency modeling before using deep reinforcement learning to learn quadrupedal walking.…”
Section: Simulator Modification (mentioning, confidence: 99%)
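The action-latency modeling attributed to Tan et al. (2018) can be approximated by buffering commands so that each one takes effect a fixed number of control steps later. The following is a hedged Python sketch, not the authors' implementation; the minimal reset()/step() interface and the EchoEnv stand-in are assumptions for demonstration.

from collections import deque

class LatencyWrapper:
    """Delay actions by a fixed number of control steps to approximate
    measured actuation latency (illustrative sketch; the wrapped env is
    assumed to expose reset() and step(action) -> (obs, reward, done))."""

    def __init__(self, env, latency_steps, default_action=0.0):
        self.env = env
        # Pre-fill so the first `latency_steps` steps apply a safe default.
        self.pending = deque([default_action] * latency_steps)

    def reset(self):
        return self.env.reset()

    def step(self, action):
        self.pending.append(action)
        delayed = self.pending.popleft()  # command issued latency_steps ago
        return self.env.step(delayed)

class EchoEnv:
    """Trivial stand-in environment whose observation echoes the applied action."""
    def reset(self):
        return 0.0
    def step(self, action):
        return action, 0.0, False  # obs, reward, done

env = LatencyWrapper(EchoEnv(), latency_steps=2)
env.reset()
for t, a in enumerate([1.0, 2.0, 3.0, 4.0]):
    obs, _, _ = env.step(a)
    print(f"t={t}: commanded {a}, applied {obs}")  # applied lags two steps behind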
“…Domain randomization produces policies that are robust enough to transfer to the real world. An alternative approach that does not involve randomness is to learn policies that perform well under an ensemble of different simulators (Boeing & Bräunl, 2012; Rajeswaran et al., 2017; Lowrey et al., 2018). Pinto et al. (2017b) simultaneously learn an adversary that can perturb the learning agent's actions while it learns in simulation.…”
Section: Robustness Through Simulator Variance (mentioning, confidence: 99%)
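To make the ensemble idea concrete, the toy sketch below selects a one-parameter pushing controller by its average return across several simulator instances rather than a single one. The point-mass dynamics, parameter grid, and proportional controller are all assumptions chosen for brevity; this illustrates the principle, not any cited method.

import numpy as np

# Toy 1-D "push to target" model: a block of uncertain mass and
# friction is driven by a proportional controller u = k * (target - x).
def episode_return(k, params, target=1.0, dt=0.05, steps=100):
    x, v, cost = 0.0, 0.0, 0.0
    for _ in range(steps):
        u = k * (target - x)                        # proportional policy
        a = (u - params["friction"] * v) / params["mass"]
        v += a * dt                                 # semi-implicit Euler
        x += v * dt
        cost += (target - x) ** 2
    return -cost

# Ensemble of plausible simulator instances (assumed parameter grid).
ensemble = [{"mass": m, "friction": f} for m in (0.1, 0.3) for f in (0.4, 0.8)]

# Pick the gain that maximizes mean return across the ensemble, so the
# controller is not tuned to any single simulator.
gains = np.linspace(0.1, 5.0, 50)
scores = [np.mean([episode_return(k, p) for p in ensemble]) for k in gains]
print(f"ensemble-robust gain: {gains[int(np.argmax(scores))]:.2f}")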
“…[algorithm listing quoted from the cited paper, lines 9-19:]

 9  for v ∈ V do
10      P_r.append(min-vertex(v, O))
11  for v ∈ P_r do
12      min_d ← ∞
13      for p ∈ V_c do
14          d ← G(v, p)
15          if d < min_d then
16              min_d ← d
17      if min_d > max_d then
18          max_d ← min_d
19  return max_d

…with the object, and the mesh of the desired contact region C_d. To compute these metrics, we first project the desired contact region C_d and the robot meshes L_{i∈N} onto the object mesh O as shown in lines 5-10.…”
Section: Benchmark Guidelines, A. Scoring (mentioning, confidence: 99%)
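The quoted max-min loop computes, over the projected robot-mesh vertices, the largest distance to the nearest vertex of the desired contact region: a directed-Hausdorff-style score. As a reading aid, here is a small runnable Python re-implementation of just that loop; it is a sketch under assumptions, with Euclidean distance standing in for the metric G and two hypothetical point arrays standing in for the projected meshes P_r and V_c.

import numpy as np

def max_min_distance(P_r, V_c):
    """Largest over P_r of the distance to the nearest point of V_c
    (the max-min loop of pseudocode lines 11-19). Euclidean distance
    stands in for the metric G of the original algorithm."""
    max_d = 0.0
    for v in P_r:
        min_d = np.inf
        for p in V_c:                      # nearest V_c vertex to v
            d = np.linalg.norm(v - p)
            if d < min_d:
                min_d = d
        if min_d > max_d:                  # track the worst case
            max_d = min_d
    return max_d

# Hypothetical inputs: projected robot contact points vs. vertices of
# the desired contact region.
P_r = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0]])
V_c = np.array([[0.0, 0.05, 0.0], [0.2, 0.0, 0.0]])
print(max_min_distance(P_r, V_c))  # -> 0.1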