2018
DOI: 10.48550/arxiv.1809.05676
Preprint

Deterministic Implementations for Reproducibility in Deep Reinforcement Learning

Abstract: While deep reinforcement learning (DRL) has led to numerous successes in recent years, reproducing these successes can be extremely challenging. One reproducibility challenge particularly relevant to DRL is nondeterminism in the training process, which can substantially affect the results. Motivated by this challenge, we study the positive impacts of deterministic implementations in eliminating nondeterminism in training. To do so, we consider the particular case of the deep Q-learning algorithm, for which we …
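
The abstract's focus on eliminating nondeterminism amounts to controlling every random seed the training loop touches. As a minimal sketch, assuming a Gymnasium-style environment API (the environment name and seed value below are illustrative, not taken from the paper), the usual sources pinned in deep Q-learning are exploration, minibatch sampling, weight initialization, and the environment itself:

    import random

    import numpy as np
    import gymnasium as gym  # assumption: a Gymnasium-style environment API

    SEED = 0  # illustrative fixed seed

    # Pin the RNGs that drive epsilon-greedy exploration, replay-buffer
    # minibatch sampling, and network weight initialization.
    random.seed(SEED)
    np.random.seed(SEED)

    # Pin the environment's own stochasticity and action sampling.
    env = gym.make("CartPole-v1")
    obs, info = env.reset(seed=SEED)
    env.action_space.seed(SEED)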

Cited by 14 publications (25 citation statements)
References 1 publication
Order By: Relevance
“…With the success of deep learning, the generalization properties of deep networks received a renewed interest in recent years [24,6,55,28]. [11,56] establish spectrally normalized risk bounds for deep networks and [54] provides refined bounds by exploiting the inter-layer Jacobian. [6] proposes tighter bounds using compression techniques.…”
Section: Related Work (mentioning)
confidence: 99%
“…We deploy two LeNet-based neural network architectures which differ only in the number of neurons in two of the layers, in order to individually match the formats of the MNIST and CIFAR-10 datasets. Our TensorFlow code for the Delta method is based on the pydeepdelta Python module [14], and is fully deterministic [10]. The corresponding Bootstrap implementation can be found in the same repository.…”
Section: The Neural Network Classifiers (mentioning)
confidence: 99%
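
The fully deterministic TensorFlow setup cited above is not reproduced here; as a hedged sketch, the standard knobs for deterministic TensorFlow training look roughly like the following (the seed value is illustrative, and enable_op_determinism is available in recent TensorFlow 2.x releases):

    import os
    import random

    import numpy as np
    import tensorflow as tf

    SEED = 42  # illustrative fixed seed

    # Seed every RNG the training loop touches before building the model.
    os.environ["PYTHONHASHSEED"] = str(SEED)
    random.seed(SEED)
    np.random.seed(SEED)
    tf.random.set_seed(SEED)

    # Force deterministic kernels; ops without a deterministic implementation
    # will raise an error instead of running nondeterministically.
    tf.config.experimental.enable_op_determinism()
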
“…The objective for deep models, on the other hand, will have multiple optima, many of which have roughly equal loss on average over all test examples but differ in the predictions they make for individual examples. For such models, nondeterminism in training may lead optimizers to different optima (Summers & Dinneen, 2021) (see also Nagarajan et al. (2018)), depending on the training randomness (Achille et al, 2017; Bengio et al, 2009).…”
Section: Introduction (mentioning)
confidence: 99%