2015
DOI: 10.48550/arxiv.1511.06342
Preprint

Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning

Abstract: The ability to act in multiple environments and transfer previous knowledge to new situations can be considered a critical aspect of any intelligent agent. Towards this goal, we define a novel method of multitask and transfer learning that enables an autonomous agent to learn how to behave in multiple tasks simultaneously, and then generalize its knowledge to new domains. This method, termed "Actor-Mimic", exploits the use of deep reinforcement learning and model compression techniques to train a single policy…
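
Although the abstract is cut off above, the method it describes is a form of policy distillation: a single multitask "mimic" network is regressed onto the soft (Boltzmann) policies induced by the Q-values of per-task expert DQNs. As a rough sketch only, not the paper's exact formulation, the policy-regression part of such an objective could look as follows in PyTorch; the temperature tau, tensor shapes, and all names are assumptions:

import torch
import torch.nn.functional as F

def policy_regression_loss(mimic_logits, expert_q_values, tau=1.0):
    # Soft (Boltzmann) policy derived from the expert DQN's Q-values;
    # the temperature tau is an assumed hyperparameter.
    expert_policy = F.softmax(expert_q_values / tau, dim=-1)
    # Cross-entropy between the expert's soft policy and the mimic
    # network's policy over the same action set.
    log_mimic_policy = F.log_softmax(mimic_logits, dim=-1)
    return -(expert_policy * log_mimic_policy).sum(dim=-1).mean()

Matching soft policies rather than raw Q-values keeps the regression targets bounded and comparable across tasks, which is the usual motivation for distillation-style objectives.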


Cited by 96 publications (139 citation statements)
References 4 publications
“…Learning with multiple objectives is shown to be beneficial in DRL tasks [Wilson et al, 2007, Pinto and Gupta, 2017, Hausman et al, 2018]. Sharing parameters across tasks [Parisotto et al, 2015, Rusu et al, 2015, Teh et al, 2017] usually results in conflicting gradients from different tasks. One way to mitigate this is to explicitly model the similarity between gradients obtained from different tasks [Yu et al, 2020, Zhang and Yeung, 2014, Kendall et al, 2018, Lin et al, 2019, Sener and Koltun, 2018, Du et al, 2018].…”
Section: Related Work
confidence: 99%
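
For context, a common way to "explicitly model the similarity between gradients obtained from different tasks" is to compare per-task gradients by their dot product and project away the conflicting component, in the spirit of PCGrad (Yu et al, 2020). A minimal sketch, assuming flattened per-task gradient vectors as NumPy arrays:

import numpy as np

def project_conflicting(g_i, g_j):
    # If the two task gradients conflict (negative dot product), remove
    # from g_i the component that points against g_j; otherwise keep g_i.
    dot = float(np.dot(g_i, g_j))
    if dot < 0:
        g_i = g_i - (dot / (np.dot(g_j, g_j) + 1e-12)) * g_j
    return g_i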
“…MTL eases deep learning's need for huge amounts of training data and its need to start learning each new task from scratch. Training shared parameters on multiple tasks allows supervision from one task to aid learning in another, and a set of trained shared features can often be reused to instantiate learning on a new task, yielding faster learning through feature reuse [28]. However, the increased data efficiency, transfer robustness, and regularization promised by MTL are never guaranteed and depend strongly on the relationships between the tasks involved [36,2].…”
Section: Related Work
confidence: 99%
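
The parameter-sharing and feature-reuse pattern described in this excerpt is often realized as a shared trunk with one lightweight head per task, so that a new task can reuse the trained trunk and only needs a fresh head. A minimal sketch, with layer sizes, task count, and names chosen purely for illustration:

import torch.nn as nn

class SharedTrunkPolicy(nn.Module):
    def __init__(self, obs_dim, n_actions, n_tasks, hidden=256):
        super().__init__()
        # Shared feature extractor trained jointly on all tasks.
        self.trunk = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # One small head per task; adding a task means adding a head.
        self.heads = nn.ModuleList(
            [nn.Linear(hidden, n_actions) for _ in range(n_tasks)]
        )

    def forward(self, obs, task_id):
        return self.heads[task_id](self.trunk(obs))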
“…In fact, it is worth noting that several works describing the benefits of TL in RL do exist (but they all differ from the study presented in this work): Tirinzoni et al (2018) show that it is possible to successfully transfer value functions across tasks, yet their work does not consider deep networks as function approximators but rather Gaussian mixtures. Parisotto et al (2015) show that it can be beneficial to fine-tune a pre-trained DRL agent, but they consider multi-task learning and policy gradient algorithms as a way of pre-training. Rusu et al (2016) also show that fine-tuning can be beneficial, but in the context of progressive networks and, again, policy gradient techniques.…”
Section: Related Work and Conclusion
confidence: 99%
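
The fine-tuning setups these works compare typically amount to initializing the new-task network from pre-trained weights and continuing gradient updates, sometimes with early layers frozen. A minimal, self-contained sketch of that recipe in PyTorch; the architecture, weight handling, and learning rate are assumptions rather than any specific paper's protocol:

import torch
import torch.nn as nn

def build_policy(obs_dim=84, n_actions=18):
    # Hypothetical small policy network; shapes are illustrative only.
    return nn.Sequential(
        nn.Linear(obs_dim, 256), nn.ReLU(),
        nn.Linear(256, n_actions),
    )

source_policy = build_policy()   # stands in for a pre-trained (e.g. multi-task) agent
target_policy = build_policy()   # network for the new task
target_policy.load_state_dict(source_policy.state_dict())  # transfer the weights

# Fine-tune all parameters on the new task, usually with a reduced learning rate.
optimizer = torch.optim.Adam(target_policy.parameters(), lr=1e-5)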