2022
DOI: 10.48550/arxiv.2205.14495
Preprint

Task-Agnostic Continual Reinforcement Learning: In Praise of a Simple Baseline

Abstract: We study task-agnostic continual reinforcement learning (TACRL) in which standard RL challenges are compounded with partial observability stemming from task agnosticism, as well as additional difficulties of continual learning (CL), i.e., learning on a non-stationary sequence of tasks. Here we compare TACRL methods with their soft upper bounds prescribed by previous literature: multi-task learning (MTL) methods which do not have to deal with non-stationary data distributions, as well as task-aware methods, whi…
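To make the setting described in the abstract concrete, the following is a minimal, hypothetical sketch (not the paper's implementation) of a task-agnostic continual RL training loop: tasks arrive as a non-stationary sequence, and the agent sees only observations, never a task label, whereas task-aware or multi-task (MTL) baselines would also receive task identity. The names `Env`, `Agent`, `train_task_agnostic`, and their signatures are illustrative assumptions, not the paper's API.

```python
# Illustrative sketch of task-agnostic continual RL (TACRL).
# All class/function names are hypothetical, not taken from the paper's code.

import random
from typing import Protocol


class Env(Protocol):
    """Assumed gym-like single-task environment interface."""
    def reset(self) -> list: ...
    def step(self, action: int) -> tuple: ...  # (obs, reward, done)


class Agent:
    """Stand-in learner; a real agent would hold a policy, value function, etc."""
    def act(self, obs) -> int:
        return random.randint(0, 1)  # placeholder policy

    def update(self, obs, action, reward, next_obs, done) -> None:
        pass  # placeholder learning step


def train_task_agnostic(agent: Agent, task_sequence: list, episodes_per_task: int) -> None:
    """Continual training: tasks are visited one after another (non-stationary),
    and the agent is never told which task it is currently facing."""
    for env in task_sequence:                # non-stationary task stream
        for _ in range(episodes_per_task):
            obs, done = env.reset(), False
            while not done:
                action = agent.act(obs)      # decision from observation only, no task id
                next_obs, reward, done = env.step(action)
                agent.update(obs, action, reward, next_obs, done)
                obs = next_obs
    # A task-aware baseline would additionally pass a task id to act()/update();
    # an MTL baseline would instead sample transitions from all tasks jointly.
```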

Cited by 2 publications (3 citation statements)
References 30 publications
“…On the other hand, our work is about learning tasks sequentially with only the current task data available for learning. Some very recent works try to apply continual RL on robotics manipulation tasks [22], [4]. [22] introduces a continual learning benchmark for robotic manipulation tasks.…”
Section: A. Related Work (mentioning)
confidence: 99%
“…It is important to note that those ratings classify the complexity of a given scenario. For example, the default scenario (class-incremental) assesses a complexity level of (0 1), domain incremental without task labels would be (0 2), task agnostic continual RL [47] (1 3). To validate the autonomy of an approach, it should be evaluated in adequate scenarios.…”
Section: Classifying the Autonomy Level of CL Algorithms (mentioning)
confidence: 99%
“…A variety of approaches has been published, among which knowledge-based distillations [27,28] and context-based decompositions [105,106] are popular. Other works are concerned with the employed model [107,108,109,110], off-policy algorithms [111], policy gradient [112] or a task-agnostic perspective [47]. Evaluations of known CL methods (e.g., GEM, A-GEM, and replay) are also applied in the RL domain [113,9].…”
Section: Existing Approaches (mentioning)
confidence: 99%