2021
DOI: 10.48550/arxiv.2102.07920
Preprint

Training Larger Networks for Deep Reinforcement Learning

Cited by 10 publications (13 citation statements)
References 18 publications

“…Secondly, the network was made deeper, for reasons that will be detailed later. Deeper networks however tend to overfit and therefore to be unhelpful in DRL [28]. In computer vision, this problem is addressed with batch normalization, which was shown to smooth the optimization landscape, stabilizing gradient estimation [29].…”
Section: Experiments and Discussion
confidence: 99%
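
The remedy this excerpt points to, wrapping each convolutional layer with batch normalization as in [29], is easy to illustrate. Below is a minimal sketch, assuming PyTorch; the layer widths and the 84x84 input size are illustrative assumptions, not values taken from the cited papers.

```python
# Minimal sketch (PyTorch assumed): a deeper convolutional encoder whose blocks
# include batch normalization, the computer-vision remedy referenced in [29].
# All layer sizes here are illustrative, not taken from the cited papers.
import torch
import torch.nn as nn

def conv_bn_block(in_ch, out_ch):
    # Conv -> BatchNorm -> ReLU: batch normalization smooths the optimization
    # landscape and stabilizes gradient estimation when many blocks are stacked.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class DeepEncoder(nn.Module):
    def __init__(self, in_channels=3, widths=(32, 64, 64, 128)):
        super().__init__()
        blocks, prev = [], in_channels
        for w in widths:
            blocks.append(conv_bn_block(prev, w))
            prev = w
        self.features = nn.Sequential(*blocks)

    def forward(self, x):
        return self.features(x)

# Example: a batch of 84x84 RGB observations.
if __name__ == "__main__":
    obs = torch.randn(8, 3, 84, 84)
    print(DeepEncoder()(obs).shape)  # torch.Size([8, 128, 6, 6])
```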
“…Specifically, we perform an in-depth comparison of the performance of PQCs and NNs with varying numbers of parameters on the Cart Pole environment. We show that recent results in classical deep Q-learning also apply to the case when a PQC is used as the function approximator, namely that increasing the number of parameters is only beneficial up to some point [52]. After this, learning becomes increasingly unstable for both PQCs and NNs.…”
Section: Introduction
confidence: 86%
“…However, the improvement between 10 and 15 layers is relatively small compared to that between 5 and 10 layers, similar to the saturation in performance w.r.t. the number of parameters found in classical deep RL [52]. We will study this type of scaling behaviour more in-depth and compare it to that of NNs in section 5.2.…”
Section: Frozen Lake
confidence: 95%
“…To avoid this issue, instead of using existing models available on the Internet, we train a new encoder from scratch with images from ImageNet shrunk to 84x84. To improve computation efficiency and to avoid the difficulties of training deep networks in DRL (Bjorck et al, 2021; Ota et al, 2021), we use a light-weight encoder with only 5 convolutional layers, which is 50 times smaller than the ResNet34 used in RRL. This allows us to perform experimentation at a much faster pace.…”
Section: Stage 1: Pretraining With Non-RL Data
confidence: 99%
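
As a rough illustration of the encoder described in this excerpt, here is a hedged sketch, assuming PyTorch, of a light-weight 5-convolutional-layer encoder for 84x84 inputs. The channel widths, kernel sizes, and resulting parameter count are guesses for illustration only; the excerpt specifies just the layer count and the roughly 50x size gap to ResNet34.

```python
# Hypothetical sketch (PyTorch assumed) of a light-weight encoder with only
# 5 convolutional layers for 84x84 images, in the spirit of the encoder
# described above. Channel widths and kernel sizes are illustrative guesses;
# the cited work specifies only the layer count and that the encoder is
# roughly 50x smaller than ResNet34.
import torch
import torch.nn as nn

class LightEncoder(nn.Module):
    def __init__(self, in_channels=3, feature_dim=256):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, stride=2), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
        )
        # Infer the flattened size from a dummy 84x84 input, then project.
        with torch.no_grad():
            n_flat = self.convs(torch.zeros(1, in_channels, 84, 84)).numel()
        self.proj = nn.Linear(n_flat, feature_dim)

    def forward(self, x):
        return self.proj(self.convs(x).flatten(start_dim=1))

if __name__ == "__main__":
    enc = LightEncoder()
    n_params = sum(p.numel() for p in enc.parameters())
    print(n_params)  # a few hundred thousand parameters vs. ~21M for ResNet34
    print(enc(torch.randn(4, 3, 84, 84)).shape)  # torch.Size([4, 256])
```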