2020
DOI: 10.48550/arxiv.2006.00979
Preprint

Acme: A Research Framework for Distributed Reinforcement Learning

Abstract: Deep reinforcement learning has led to many recent and groundbreaking advancements. However, these advances have often come at the cost of both the scale and complexity of the underlying RL algorithms. Increases in complexity have in turn made it more difficult for researchers to reproduce published RL algorithms or rapidly prototype ideas. To address this, we introduce Acme, a tool to simplify the development of novel RL algorithms that is specifically designed to enable simple agent implementations that can …

Cited by 47 publications (72 citation statements)
References 25 publications

“…The first part is the parallel actors, which interact with the environment and generate data; the second is the parallel learners, which consume the data for policy training; the third and fourth parts are the distributed neural network and the experience store that connect the actors and learners. Based on this framework, a number of advanced distributed reinforcement learning frameworks have been developed, largely improving data throughput [36], [37], [38]. In Suphx and DouZero, distributed learning is adopted to accelerate RL training, where multiple rollouts are performed in parallel to collect data.…”
Section: Basic Techniques for Suphx and DouZero
mentioning
confidence: 99%
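To make the actor/learner decomposition described in this excerpt concrete, the following is a minimal single-process Python sketch of the four components (actors, learners, a shared policy network, and an experience store). All class and method names are illustrative assumptions, not the API of Acme, Suphx, DouZero, or any other cited framework.

```python
import collections
import random

# Experience store ("replay buffer") connecting actors and learners.
Transition = collections.namedtuple("Transition", "obs action reward next_obs done")

class ReplayBuffer:
    def __init__(self, capacity=100_000):
        self._data = collections.deque(maxlen=capacity)

    def add(self, transition):
        self._data.append(transition)

    def sample(self, batch_size):
        return random.sample(self._data, batch_size)

class Actor:
    """Interacts with the environment and generates data."""
    def __init__(self, env, policy, replay):
        self.env, self.policy, self.replay = env, policy, replay

    def rollout(self, num_steps):
        obs = self.env.reset()
        for _ in range(num_steps):
            action = self.policy.select_action(obs)
            next_obs, reward, done, _ = self.env.step(action)
            self.replay.add(Transition(obs, action, reward, next_obs, done))
            obs = self.env.reset() if done else next_obs

class Learner:
    """Consumes data from the experience store to train the policy."""
    def __init__(self, policy, replay, batch_size=256):
        self.policy, self.replay, self.batch_size = policy, replay, batch_size

    def step(self):
        batch = self.replay.sample(self.batch_size)
        self.policy.update(batch)  # one gradient step on the sampled batch
```

In a distributed setting, many Actor instances would run as separate processes, periodically pulling fresh parameters from the Learner (the "distributed neural network" above), while the experience store becomes a shared replay service.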
“…We train a DQN agent on top of the discretization learned by the AQuaDem framework. The architecture of the Q-network we use is the default LayerNorm architecture from the Q-network of the Acme library [Hoffman et al., 2020], which consists of a hidden layer of size 512 with layer normalization and tanh activation, followed by two hidden layers of sizes 512 and 256 with ELU activation. We explored multiple Q-value losses, for which we used the Adam optimizer: regular DQN [Mnih et al., 2015], double DQN with experience replay [Van Hasselt et al., 2016, Schaul et al., 2016], and Munchausen DQN [Vieillard et al., 2020]; the latter led to the best performance.…”
Section: D.4.1 AQuaDQN
mentioning
confidence: 99%
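Read literally, the Q-network described in this excerpt can be sketched as follows in Haiku/JAX. The sketch follows only the textual description (a 512-unit layer with layer normalization and tanh, then 512- and 256-unit layers with ELU, then one value per discretized action); it is not copied from the Acme source, and the number of discretized actions is a placeholder assumption.

```python
import haiku as hk
import jax
import jax.numpy as jnp

def q_network(obs, num_actions):
    # Hidden layer of size 512 with layer normalization and tanh activation,
    # followed by two hidden layers of sizes 512 and 256 with ELU activation,
    # then one Q-value per discretized (AQuaDem) action.
    net = hk.Sequential([
        hk.Linear(512),
        hk.LayerNorm(axis=-1, create_scale=True, create_offset=True),
        jnp.tanh,
        hk.Linear(512), jax.nn.elu,
        hk.Linear(256), jax.nn.elu,
        hk.Linear(num_actions),
    ])
    return net(obs)

# Standard Haiku transform; num_actions=10 is a placeholder, not the value
# used in the cited experiments.
q_fn = hk.transform(lambda obs: q_network(obs, num_actions=10))
```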
“…To train agents via behavioral cloning [57], we use the open-source Acme [29] to learn a policy from human gameplay data. Specifically, we collected 5 human-human trajectories of length 1200 time steps for each of the 5 layouts, resulting in 60k total environment steps.…”
Section: Implementation Details
mentioning
confidence: 99%
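As a rough illustration of the behavioral-cloning setup described in this excerpt, here is a generic supervised-learning loss over (observation, action) pairs from demonstration data, written with JAX and Optax. It assumes discrete actions and a policy network that returns action logits; it is a sketch under those assumptions, not the configuration of Acme's BC agent used in the cited work.

```python
import jax
import jax.numpy as jnp
import optax

def bc_loss(params, apply_fn, observations, actions):
    """Negative log-likelihood of the demonstrated actions under the policy."""
    logits = apply_fn(params, observations)               # [batch, num_actions]
    log_probs = jax.nn.log_softmax(logits)
    chosen = jnp.take_along_axis(log_probs, actions[:, None], axis=-1)
    return -jnp.mean(chosen)

def make_update_fn(apply_fn, optimizer):
    @jax.jit
    def update(params, opt_state, observations, actions):
        loss, grads = jax.value_and_grad(bc_loss)(params, apply_fn,
                                                  observations, actions)
        updates, opt_state = optimizer.update(grads, opt_state, params)
        params = optax.apply_updates(params, updates)
        return params, opt_state, loss
    return update

# Example wiring: apply_fn and params would come from the policy network
# (e.g. a Haiku-transformed module); both are assumptions here.
optimizer = optax.adam(1e-4)
```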