2022
DOI: 10.48550/arxiv.2202.10324
Preprint
VRL3: A Data-Driven Framework for Visual Deep Reinforcement Learning

Abstract: We propose a simple but powerful data-driven framework for solving highly challenging visual deep reinforcement learning (DRL) tasks. We analyze a number of major obstacles in taking a data-driven approach, and present a suite of design principles, training strategies, and critical insights about data-driven visual DRL. Our framework has three stages: in stage 1, we leverage non-RL datasets (e.g. ImageNet) to learn task-agnostic visual representations; in stage 2, we use offline RL data (e.g. a limited number …
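The staged recipe described in the abstract can be made concrete with a short sketch. This is a minimal illustration, not the authors' VRL3 code: the tiny encoder, the input shapes, the classification head, and the random stand-in batches are all assumptions made for the example.

```python
# Minimal sketch of the three-stage idea from the abstract; NOT the authors'
# VRL3 implementation. Encoder, shapes, and data are illustrative assumptions.
import torch
import torch.nn as nn

encoder = nn.Sequential(                       # task-agnostic visual encoder
    nn.Conv2d(3, 32, 3, stride=2), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2), nn.ReLU(),
    nn.Flatten(),
)
with torch.no_grad():                          # infer the flattened feature size
    feat_dim = encoder(torch.zeros(1, 3, 84, 84)).shape[1]

# Stage 1: pretrain the encoder on non-RL images (e.g. ImageNet).
head = nn.Linear(feat_dim, 1000)               # hypothetical classification head
opt = torch.optim.Adam([*encoder.parameters(), *head.parameters()], lr=1e-4)
images = torch.randn(8, 3, 84, 84)             # stand-in for an ImageNet batch
labels = torch.randint(0, 1000, (8,))
loss = nn.functional.cross_entropy(head(encoder(images)), labels)
opt.zero_grad(); loss.backward(); opt.step()

# Stage 2: offline RL on a fixed dataset of (obs, action, reward, next_obs),
# reusing `encoder` as the shared visual backbone.
# Stage 3: online RL fine-tuning in the target environment, continuing from
# the stage-2 agent. The RL losses are omitted here for brevity.
```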

Cited by 3 publications (4 citation statements)
References 18 publications
“…Pretraining backbone vision components is common in RL setups, since it often yields a higher end performance [25], [17], [32], [37], [39]. We therefore pretrain the patch embedder in the same self-supervised fashion as [7].…”
Section: B. AirLoc Model
confidence: 99%
“…We instead use action-conditioned self-supervised learning to learn task-relevant representation. Several works pretrain policies on offline demonstrations [38,45]. However, the training data is from the policy learning environment and requires expert policy to collect trajectories, limiting the scalability and efficiency.…”
Section: Related Work 2.1 Image-based Policy Learning
confidence: 99%
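The action-conditioned self-supervised learning mentioned above can be illustrated with a forward-dynamics objective: predict the features of the next observation from the current features and the action. This is a hedged sketch of one common form of that idea, not the quoted paper's method; every module, shape, and the stop-gradient target are assumptions.

```python
# Hedged sketch of action-conditioned SSL via forward-dynamics prediction.
# All modules and shapes are illustrative assumptions.
import torch
import torch.nn as nn

feat, act = 64, 6
enc = nn.Linear(3 * 84 * 84, feat)             # stand-in visual encoder
dyn = nn.Linear(feat + act, feat)              # predicts next-state features

obs = torch.randn(8, 3 * 84 * 84)              # flattened current frames
next_obs = torch.randn(8, 3 * 84 * 84)         # flattened next frames
actions = torch.randn(8, act)

pred = dyn(torch.cat([enc(obs), actions], dim=-1))
loss = nn.functional.mse_loss(pred, enc(next_obs).detach())  # stop-grad target
loss.backward()                                # trains enc to be action-aware
```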
“…Besides the joint learning framework used in CURL and SAC+AE, Shelhamer et al [67] investigate a pretraining framework to combine SSL with RL, and use self-supervised loss as an intrinsic reward to further boost performance during online learning. Recent works on policy learning (e.g., [56,65,74,80,86]) also take advantage of the self-supervised learning in a multi-step framework and show its great potential in solving challenging visual-based problems.…”
Section: Observation on Pretraining Framework
confidence: 99%
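The intrinsic-reward idea attributed to Shelhamer et al. above amounts to adding a scaled self-supervised loss to the environment reward. A minimal sketch, assuming a simple additive bonus with coefficient beta; the exact shaping in the cited paper may differ.

```python
# Hedged sketch: SSL loss as an exploration bonus. The additive form and the
# beta coefficient are assumptions, not the cited paper's exact scheme.
import torch

def shaped_reward(env_reward: torch.Tensor, ssl_loss: torch.Tensor,
                  beta: float = 0.1) -> torch.Tensor:
    # A high SSL loss marks unfamiliar observations, so it can serve as an
    # exploration bonus on top of the environment reward.
    return env_reward + beta * ssl_loss.detach()

print(shaped_reward(torch.tensor(1.0), torch.tensor(0.5)))  # tensor(1.0500)
```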
“…MVP [80] follows MAE [32] to train a visual encoder and even outperforms supervised pre-training. RRL [65] and VRL3 [74] also benefit from pre-training a deeper visual encoder on large datasets like ImageNet [16].…”
Section: Related Work
confidence: 99%
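The MAE-style pretraining referenced in the last statement can be sketched as masked patch reconstruction: hide most patches, then reconstruct them from the visible context. The following is a minimal illustration, not the MVP or MAE implementations; the patch size, masking ratio, and tiny one-layer model are assumptions.

```python
# Minimal masked-reconstruction sketch in the spirit of MAE (as used by MVP),
# not their actual implementations. All hyperparameters are assumptions.
import torch
import torch.nn as nn

patch, dim = 14, 128
embed = nn.Linear(3 * patch * patch, dim)            # patch embedder
mixer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
decoder = nn.Linear(dim, 3 * patch * patch)          # per-patch pixel decoder

imgs = torch.randn(4, 3, 84, 84)                     # stand-in image batch
# Split into non-overlapping patches: (B, N, 3 * patch * patch).
patches = imgs.unfold(2, patch, patch).unfold(3, patch, patch)
patches = patches.permute(0, 2, 3, 1, 4, 5).reshape(4, -1, 3 * patch * patch)

mask = torch.rand(patches.shape[:2]) < 0.75          # hide ~75% of patches
tokens = embed(patches).masked_fill(mask.unsqueeze(-1), 0.0)
recon = decoder(mixer(tokens))                       # attention lets masked slots
                                                     # borrow visible context
loss = ((recon - patches) ** 2)[mask].mean()         # loss only on masked patches
loss.backward()
```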