Tonic: A Deep Reinforcement Learning Library for Fast Prototyping and Benchmarking
Preprint, 2020
DOI: 10.48550/arXiv.2011.07537

Cited by 7 publications (8 citation statements); references 0 publications.
“…Figure 4 presents the training curves of each of the reduced rule sets, and Figure 5 shows the training curves of the rest of the models. For comparisons with different reinforcement learning algorithms in the standard AntBullet environment, see Pardo [38] for baselines where, e.g., Proximal Policy Optimization (PPO) achieves a score of around 3100, Deep Deterministic Policy Gradient (DDPG) scores around 2500, and Advantage Actor-Critic (A2C) scores around 1800.…”
Section: Results (mentioning)
confidence: 99%
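The AntBullet baselines cited above come from Tonic's large-scale benchmark. As a rough, minimal sketch of how such a training run might be launched with Tonic, the snippet below follows the library's published examples; the argument names, the Bullet environment wrapper, and the 'AntBulletEnv-v0' identifier are assumptions that should be checked against the installed version of the library.

```python
# Minimal sketch (assumptions noted above): train PPO on the PyBullet Ant task with Tonic.
import tonic

tonic.train(
    header='import tonic.torch',                                  # backend imported before building the agent
    agent='tonic.torch.agents.PPO()',                             # agent specification, evaluated after the header
    environment="tonic.environments.Bullet('AntBulletEnv-v0')",   # PyBullet Ant environment wrapper
    name='PPO-AntBulletEnv-v0',                                   # experiment name used for logs and checkpoints
    seed=0,                                                       # random seed for the run
)
```

The resulting training curves can then be compared against the published baselines, e.g. the score of around 3100 reported for PPO.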
“…Libraries: Neural networks were implemented in PyTorch (Paszke et al., 2019). The RL algorithms were implemented using Tonic (Pardo, 2021). The ES algorithm was implemented using ES Torch (Karakasli, 2020).…”
Section: A. Experimental Details (mentioning)
confidence: 99%
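As a generic illustration of the kind of small PyTorch policy network such experiments typically build on (this is not the cited work's architecture and not Tonic's own model classes, just a plain PyTorch sketch with illustrative dimensions):

```python
import torch
import torch.nn as nn

class MLPPolicy(nn.Module):
    """Small tanh MLP mapping observations to action means (illustrative only)."""
    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

# Example usage with illustrative dimensions (28 observations, 8 actions).
policy = MLPPolicy(obs_dim=28, act_dim=8)
action = policy(torch.zeros(1, 28))
```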
“…Dopamine [4] focuses on DQN variants [28, 6, 19], making them available to researchers as baseline implementations. Tonic [31] aims to provide many continuous control algorithms together with large-scale benchmark results. There are two design differences between d3rlpy and the existing libraries.…”
Section: Related Work (mentioning)
confidence: 99%
“…Moreover, non-standardized implementations make it difficult for researchers to track the exact implementation differences between algorithms. On the other hand, there are already many libraries that provide a collection of deep RL algorithms [15, 4, 31]. However, they are designed for online RL paradigms and do not provide complete support for offline RL in terms of algorithms and interfaces.…”
Section: Introduction (mentioning)
confidence: 99%