Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence 2017
DOI: 10.24963/ijcai.2017/353

Autonomous Task Sequencing for Customized Curriculum Design in Reinforcement Learning

Abstract: Transfer learning is a method where an agent reuses knowledge learned in a source task to improve learning on a target task. Recent work has shown that transfer learning can be extended to the idea of curriculum learning, where the agent incrementally accumulates knowledge over a sequence of tasks (i.e. a curriculum). In most existing work, such curricula have been constructed manually. Furthermore, they are fixed ahead of time, and do not adapt to the progress or abilities of the agent. In this paper, we form…

Citations: cited by 77 publications (93 citation statements)
References: 10 publications
“…For GridWorld, Experiment 3 (Figure 3) has parameters n := |T| = 12 and L = 4, while in Experiment 4 n = 7, and L = 7. For both domains, the intermediate tasks have been generated manually using methods from Narvekar et al. [2017], by varying the size of the environment, and adding and removing elements (pits and fires in GridWorld, and columns and movable blocks in BlockDude). We intentionally created both tasks that provide positive and negative transfer towards the final task, in order to test the ability of the sequencing algorithm to choose the most appropriate ones.…”
Section: Methods (mentioning)
confidence: 99%
“…The automatic generation of curricula [Da Silva and Costa, 2019] has been divided into two sub-problems: task generation [Narvekar et al., 2016; Da Silva and Costa, 2018], that is the problem of creating a set of tasks such that transferring from them is most likely beneficial for the final task; and task sequencing [Svetlik et al., 2017; Narvekar et al., 2017; Da Silva and Costa, 2018; Foglino et al., 2019], whereby previously generated tasks are optimally selected and ordered. Current methods for task sequencing attempt to determine the optimal order of tasks either with [Narvekar et al., 2017; Baranes and Oudeyer, 2013] or without [Svetlik et al., 2017; Da Silva and Costa, 2018] executing the tasks. All task sequencing methods mentioned above are heuristic algorithms tailored to the minimization of time-to-threshold.…”
Section: Related Work (mentioning)
confidence: 99%
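
The time-to-threshold objective named in the statement above can be illustrated with a minimal sketch. It assumes the metric is the number of episodes until the return on the final task first reaches a fixed threshold, and that episodes spent on earlier curriculum tasks count toward the total. The function names `time_to_threshold` and `curriculum_cost` are hypothetical and do not come from any of the cited papers.

```python
from typing import Optional, Sequence


def time_to_threshold(returns: Sequence[float], threshold: float) -> Optional[int]:
    """Return the 1-based index of the first episode whose return reaches the threshold.

    Returns None if the threshold is never reached.
    """
    for episode, episode_return in enumerate(returns, start=1):
        if episode_return >= threshold:
            return episode
    return None


def curriculum_cost(
    source_episodes: Sequence[int],
    target_returns: Sequence[float],
    threshold: float,
) -> Optional[int]:
    """Total cost of a curriculum under the assumption stated in the lead-in:
    episodes spent on each source task plus episodes needed on the target task.
    """
    on_target = time_to_threshold(target_returns, threshold)
    if on_target is None:
        return None  # the curriculum never reached the threshold on the target
    return sum(source_episodes) + on_target
```

Under this accounting, a curriculum only pays off if the episodes it saves on the final task exceed the episodes it spends on the intermediate tasks.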
“…Transfer learning is leveraged to transfer information between each pair of tasks in this sequence. Our work builds upon the model proposed by Narvekar et al. [16], which formulates curriculum generation as an interaction between two agents acting in two different MDPs. One is a learning agent that is trying to solve a specific target task MDP M_t, as is the standard case in reinforcement learning.…”
Section: Curriculum Learning (mentioning)
confidence: 99%
“…a curriculum) for an agent to train on, such that after training on that sequence, learning speed or performance on a target task is improved. Automatically designing a curriculum is an open problem that has only recently begun to be examined [5,8,9,16,19,23]. One recent approach [16] proposed formulating the selection of tasks using a (meta-level) curriculum Markov Decision Process (MDP).…”
Section: Introduction (mentioning)
confidence: 99%
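
The two-level interaction described in the statements above (a curriculum agent choosing tasks for a learning agent) can be sketched as a toy loop. Everything specific here is an assumption made for illustration and is not taken from the quoted text: the learner's knowledge is collapsed into a single integer `skill`, each task is a hypothetical `Task` with a scalar difficulty, the cost of training is a made-up function of the gap between difficulty and skill, and the episode ends once the target task is mastered.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Task:
    name: str
    difficulty: int  # hypothetical scalar stand-in for a full task MDP


def train(skill: int, task: Task) -> Tuple[int, int]:
    """Toy learner: return (new_skill, episodes_spent) after training on a task.

    Assumed cost model: episodes grow quadratically with the skill gap.
    """
    gap = max(task.difficulty - skill, 0)
    episodes = gap * gap + 1
    return max(skill, task.difficulty), episodes


def easiest_unsolved(skill: int, tasks: Sequence[Task]) -> Task:
    """Example curriculum policy: pick the easiest task above the current skill."""
    unsolved = [t for t in tasks if t.difficulty > skill]
    return min(unsolved, key=lambda t: t.difficulty)


def run_curriculum(
    tasks: Sequence[Task],
    target: Task,
    policy: Callable[[int, Sequence[Task]], Task],
) -> Tuple[List[str], int]:
    """Meta-level loop: state = learner skill, action = chosen task,
    cost = episodes spent. `tasks` must include `target`.
    """
    skill, total_episodes, chosen = 0, 0, []
    while skill < target.difficulty:  # terminal once the target is mastered
        task = policy(skill, tasks)
        skill, episodes = train(skill, task)
        total_episodes += episodes
        chosen.append(task.name)
    return chosen, total_episodes
```

With three hypothetical tasks of difficulty 1, 2, and 3 (the last being the target), the `easiest_unsolved` policy visits all three in order, while a policy that always returns the target trains on it directly; comparing the two totals shows how the choice of sequence changes the cost in this toy model.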