2019
DOI: 10.48550/arxiv.1909.11730
Preprint

"Good Robot!": Efficient Reinforcement Learning for Multi-Step Visual Tasks with Sim to Real Transfer

Andrew Hundt,
Benjamin Killeen,
Nicholas Greene
et al.

Abstract: In order to learn effectively, robots must be able to extract the intangible context by which task progress and mistakes are defined. In the domain of reinforcement learning, much of this information is provided by the reward function. Hence, reward shaping is a necessary part of how we can achieve state-of-the-art results on complex, multi-step tasks. However, comparatively little work has examined how reward shaping should be done so that it captures task context, particularly in scenarios where the task is …

Cited by 3 publications
(3 citation statements)
References 19 publications
(38 reference statements)
“…There are two classes of DA: instance-based and feature-based. DA is utilized in (Bousmalis et al 2018;Yan et al 2017;Hundt et al 2020) research to optimize robot grasping. Research has shown that the DA method allows the model to learn a mapping from source to target domain (Tobin et al 2017).…”
Section: Related Work (mentioning)
confidence: 99%
“…The benefit of those models is their ability to project a goal image and their current observation into their feature space and compute a path towards the target feature for visual servoing (Watter et al, 2015;Byravan et al, 2018), reaching and pushing (Srinivas et al, 2018;Yu et al, 2019) with gradient-based optimisation methods. Visuomotor controllers trained in the reinforcement learning paradigm typically model the distance to a desired, visually specified goal via reward functions which can be either shaped explicitly based on expert domain knowledge (Hundt et al, 2019) or implicitly learned from user feedback about task success (Singh et al, 2019). Our approach of using dynamic images for goal distance estimation sets itself apart from these methods as it uses dynamic images as an efficient, non-parametric conditioning scheme.…”
Section: Related Work (mentioning)
confidence: 99%
“…PyBullet and MuJoCo, on the other hand, present wider integration with DL and RL libraries and gym environments. In those cases where system identification for one-shot transfer is the objective, researchers have often built or customized specific simulations that meet problem-specific requirements and constraints [32], [36], [41].…”
Section: F Simulation Environments (mentioning)
confidence: 99%