2020
DOI: 10.48550/arxiv.2012.13490
Preprint
Towards Continual Reinforcement Learning: A Review and Perspectives

Abstract: In this article, we aim to provide a literature review of different formulations and approaches to continual reinforcement learning (RL), also known as lifelong or non-stationary RL. We begin by discussing our perspective on why RL is a natural fit for studying continual learning. We then provide a taxonomy of different continual RL formulations and mathematically characterize the non-stationary dynamics of each setting. We go on to discuss evaluation of continual RL agents, providing an overview of benchmarks…

Cited by 27 publications (42 citation statements)
References 205 publications
“…The setting we study in our work shares conceptual similarities with prior work in continual and lifelong learning (Schmidhuber, 1987; Thrun & Mitchell, 1995; Parisi et al, 2019; Hadsell et al, 2020). In the context of reinforcement learning, this work has studied the problem of episodic learning in sequential MDPs (Khetarpal et al, 2020). … Second, the continuing setting (bottom row, (2)), where a floor cleaning robot is tasked with keeping a floor clean and is only evaluated on its cumulative performance (Eq. 2) over the agent's lifetime.…”
Section: Related Work
confidence: 98%
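
The continuing-setting objective the excerpt refers to (its "Eq. 2") is not reproduced on this page; as a hedged sketch, a lifetime cumulative-reward objective and its long-run average-reward counterpart are commonly written as below, where T, R_{t+1}, and π are illustrative symbols rather than the citing paper's own notation:

\[
J_{\text{lifetime}}(\pi) \;=\; \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{T} R_{t+1}\right],
\qquad
\bar{r}(\pi) \;=\; \lim_{T\to\infty} \frac{1}{T}\,\mathbb{E}_{\pi}\!\left[\sum_{t=0}^{T-1} R_{t+1}\right].
\]

In contrast to the episodic setting, performance here is judged over the agent's entire lifetime rather than reset at episode boundaries.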
“…The type of knowledge that is transferred is the set of policies learned in source tasks, which are re-evaluated in the target task and recombined using the GPI procedure. A natural use case for ξ-learning is continual problems (Khetarpal et al, 2020), where an agent has to continually adapt to changing tasks, which in our setting are different reward functions.…”
Section: Related Work
confidence: 99%
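
The GPI (generalized policy improvement) recombination mentioned in the excerpt above acts greedily with respect to the maximum, over all source-task policies, of their action values re-evaluated under the target reward. The following is a minimal sketch of that step only, assuming those re-evaluated values are already available; the function and variable names are illustrative and not taken from the cited ξ-learning implementation.

# Hedged sketch of the GPI action-selection step, not the cited implementation.
import numpy as np

def gpi_action(q_values_per_policy: np.ndarray) -> int:
    """Pick the action maximizing Q over all source policies.

    q_values_per_policy: array of shape (n_policies, n_actions) holding
    Q^{pi_i}(s, a) for the current state s, evaluated with the target reward.
    """
    # GPI: the recombined policy is greedy w.r.t. max_i Q^{pi_i}(s, a).
    best_per_action = q_values_per_policy.max(axis=0)  # shape (n_actions,)
    return int(best_per_action.argmax())

# Toy usage: three source policies, four actions in the current state.
q = np.array([[0.1, 0.4, 0.2, 0.0],
              [0.3, 0.1, 0.5, 0.2],
              [0.0, 0.2, 0.1, 0.6]])
print(gpi_action(q))  # -> 3, the highest-valued action across policies

The resulting policy is guaranteed to perform at least as well as each individual source policy on the target task, which is what makes the recombination attractive for transfer across reward functions.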
“…time. Non-stationarity can arise from diverse causes and can be interpreted as a form of partial knowledge of the environment (Khetarpal et al 2020). Learning in non-stationary environments has been widely addressed in the literature (Garcia and Smith 2000; Ghate and Smith 2013; Lesner and Scherrer 2015).…”
Section: Introduction
confidence: 99%
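
One standard way to make the non-stationarity mentioned above explicit, sketched here with illustrative notation rather than the review's own, is to index the MDP's transition and reward functions by time:

\[
M_t \;=\; \langle \mathcal{S}, \mathcal{A}, P_t, R_t, \gamma \rangle,
\qquad
P_t(s' \mid s, a), \quad R_t(s, a),
\]

so that different continual-RL formulations correspond to different assumptions about how \(P_t\) and \(R_t\) evolve and about how much of that evolution the agent can observe, i.e. its partial knowledge of the environment.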
“…In this sense, Lifelong Learning (LL) can be considered closer to the intuitive idea of learning for human agents. More technically, LL requires the agent to readily adapt its behavior to the evolution of the environment, as well as to keep memory of past behaviors in order to leverage this knowledge in future, similar phases (Khetarpal et al 2020). This represents, indeed, a critical trade-off peculiar to the lifelong setting.…”
Section: Introduction
confidence: 99%
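
The adapt-versus-retain trade-off described in the excerpt above can be illustrated with a simple policy-library pattern: past policies are kept in memory and reused when a previously seen phase recurs, while unseen phases get a freshly adapted copy. This is an illustrative sketch, not a method from the cited works, and all names in it are hypothetical.

# Illustrative sketch (hypothetical names) of retaining past policies for reuse
# in recurring phases while adapting a fresh copy for new phases.
from copy import deepcopy

class PolicyLibrary:
    def __init__(self):
        self._policies = {}  # phase_id -> stored policy (retention)

    def policy_for(self, phase_id, current_policy):
        # Reuse retained knowledge if this phase was seen before...
        if phase_id in self._policies:
            return self._policies[phase_id]
        # ...otherwise start from a copy of the current policy and adapt it.
        fresh = deepcopy(current_policy)
        self._policies[phase_id] = fresh
        return fresh

The design choice captured here is the trade-off itself: more aggressive adaptation of stored policies improves fit to the current phase but risks overwriting knowledge needed when an earlier phase returns.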