2021 · Preprint
DOI: 10.48550/arxiv.2104.01655

Efficient Transformers in Reinforcement Learning using Actor-Learner Distillation

Abstract: Many real-world applications such as robotics provide hard constraints on power and compute that limit the viable model complexity of Reinforcement Learning (RL) agents. Similarly, in many distributed RL settings, acting is done on unaccelerated hardware such as CPUs, which likewise restricts model size to prevent intractable experiment run times. These "actor-latency" constrained settings present a major obstruction to the scaling up of model complexity that has recently been extremely successful in supervised…
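
The truncated abstract centers on the paper's actor-learner distillation idea: train a large, expressive model on accelerated hardware while a small, cheap policy runs on the latency-constrained actors, and continually distill the large policy into the small one. The sketch below is only an illustration of that pattern, assuming a small LSTM actor, a transformer-based learner, and a KL-based policy-distillation loss; the class names, sizes, and loss details are hypothetical, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmallActor(nn.Module):
    """Cheap policy intended to run on actor-latency-constrained hardware (e.g. CPU)."""
    def __init__(self, obs_dim, n_actions, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, obs_seq):                 # obs_seq: (B, T, obs_dim)
        h, _ = self.lstm(obs_seq)               # (B, T, hidden)
        return self.head(h)                     # action logits (B, T, A)

class LargeLearner(nn.Module):
    """Expensive transformer policy trained only on accelerated hardware."""
    def __init__(self, obs_dim, n_actions, d_model=256, n_layers=4):
        super().__init__()
        self.embed = nn.Linear(obs_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, n_actions)

    def forward(self, obs_seq):
        return self.head(self.encoder(self.embed(obs_seq)))

def distillation_loss(actor_logits, learner_logits):
    """KL(learner || actor): push the small actor's policy toward the learner's."""
    learner_probs = F.softmax(learner_logits.detach(), dim=-1)
    actor_logp = F.log_softmax(actor_logits, dim=-1)
    return F.kl_div(actor_logp, learner_probs, reduction="batchmean")
```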

Cited by 6 publications (8 citation statements)
References 14 publications
“…Parisotto et al. (2020) address the problem of using transformers in RL and show that adding gating layers on top of the transformer layers can stabilize training. Subsequent works addressed the increased computational load of using a transformer for an agent's policy (Irie et al., 2021; Parisotto & Salakhutdinov, 2021). Chen et al. (2021) and Janner et al. (2021) take a different approach, modeling the RL problem as a sequence modeling problem and using a transformer to predict actions without additional networks for an actor or critic.…”
Section: Related Work
confidence: 99%
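
The gating idea credited to Parisotto et al. (2020) in the statement above can be made concrete with a small sketch: the residual connection around the attention sublayer is replaced with a GRU-style gate. This is an illustrative reconstruction under that assumption, not the exact GTrXL block.

```python
import torch
import torch.nn as nn

class GatedAttentionBlock(nn.Module):
    """Transformer sublayer where the residual add is replaced by a GRU-style gate,
    the kind of gating reported to stabilize RL training (Parisotto et al., 2020)."""
    def __init__(self, d_model=256, n_heads=4):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.GRUCell(d_model, d_model)   # gate in place of x + attn(x)

    def forward(self, x):                          # x: (B, T, d_model)
        n = self.norm(x)
        y, _ = self.attn(n, n, n, need_weights=False)
        b, t, d = x.shape
        # nn.GRUCell expects 2-D inputs, so fold batch and time together.
        gated = self.gate(y.reshape(b * t, d), x.reshape(b * t, d))
        return gated.reshape(b, t, d)
```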
“…In light of the above works, researchers have been tempted to investigate the benefit of transformer models in improving reinforcement learning performance. The first line of work applies the transformer model to represent components of standard RL algorithms, such as policies, models, and value functions (Parisotto et al., 2020; Parisotto & Salakhutdinov, 2021). Instead of this, the second line of work (Chen et al., 2021; Janner et al., 2021b) abstracts RL as a sequence modelling problem and efficiently utilizes the existing transformer framework widely used in language modelling to solve the RL problem.…”
Section: A. Related Work
confidence: 99%
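
The sequence-modeling view attributed to Chen et al. (2021) and Janner et al. (2021b) treats a trajectory as a single token stream and lets one transformer predict actions directly. A minimal sketch of that framing follows; the interleaving of return-to-go, state, and action tokens and all dimensions are illustrative assumptions, not a specific published architecture.

```python
import torch
import torch.nn as nn

class SequenceModelPolicy(nn.Module):
    """Autoregressive model over (return-to-go, state, action) tokens; actions are
    read off the transformer outputs, with no separate actor or critic network."""
    def __init__(self, state_dim, n_actions, d_model=128, n_layers=3, max_steps=64):
        super().__init__()
        self.embed_rtg = nn.Linear(1, d_model)
        self.embed_state = nn.Linear(state_dim, d_model)
        self.embed_action = nn.Embedding(n_actions, d_model)
        self.pos = nn.Embedding(3 * max_steps, d_model)     # 3 tokens per timestep
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, n_layers)
        self.action_head = nn.Linear(d_model, n_actions)

    def forward(self, rtg, states, actions):
        # rtg: (B, T, 1), states: (B, T, state_dim), actions: (B, T) int64
        tokens = torch.stack(
            [self.embed_rtg(rtg), self.embed_state(states), self.embed_action(actions)],
            dim=2,                                           # (B, T, 3, d_model)
        ).flatten(1, 2)                                      # interleave -> (B, 3T, d_model)
        seq_len = tokens.size(1)
        tokens = tokens + self.pos(torch.arange(seq_len, device=tokens.device))
        # Additive causal mask so each token only attends to the past.
        mask = torch.triu(torch.full((seq_len, seq_len), float("-inf")), diagonal=1)
        h = self.backbone(tokens, mask=mask)
        return self.action_head(h[:, 1::3])                  # predict actions from state tokens
```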
“…Learned sparse attention mechanisms combined with feed-forward neural networks represent exciting alternatives for training RNNs. The best way to use attention strategies for partially observable reinforcement learning is still evolving (Parisotto et al., 2020b; Parisotto & Salakhutdinov, 2021; Loynd et al., 2020; Chen et al., 2021; Janner et al., 2021). Chen et al. (2021) and Janner et al. (2021) use transformers in the offline reinforcement learning setting.…”
Section: Related Work
confidence: 99%