Autonomous Task Sequencing for Customized Curriculum Design in Reinforcement Learning

Narvekar, Sanmit; Sinapov, Jivko; Stone, Peter

doi:10.24963/ijcai.2017/353

Cited by 77 publications

(93 citation statements)

References 10 publications


“…For GridWorld, Experiment 3 (Figure 3) has parameters n := |T| = 12 and L = 4, while in Experiment 4 n = 7, and L = 7. For both domains, the intermediate tasks have been generated manually using methods from Narvekar et al [2017], by varying the size of the environment, and adding and removing elements (pits and fires in GridWorld, and columns and movable blocks in BlockDude). We intentionally created both tasks that provide positive and negative transfer towards the final task, in order to test the ability of the sequencing algorithm to choose the most appropriate ones.…”

Section: Methods (mentioning)

confidence: 99%

“…The automatic generation of curricula [Da Silva and Costa, 2019] has been divided into two sub-problems: task generation [Narvekar et al, 2016; Da Silva and Costa, 2018], that is the problem of creating a set of tasks such that transferring from them is most likely beneficial for the final task; and task sequencing [Svetlik et al, 2017; Narvekar et al, 2017; Da Silva and Costa, 2018; Foglino et al, 2019], whereby previously generated tasks are optimally selected and ordered. Current methods for task sequencing attempt to determine the optimal order of tasks either with [Narvekar et al, 2017; Baranes and Oudeyer, 2013] or without [Svetlik et al, 2017; Da Silva and Costa, 2018] executing the tasks. All task sequencing methods mentioned above are heuristic algorithms tailored to the minimization of time-to-threshold.…”

Section: Related Work (mentioning)

confidence: 99%


Curriculum Learning for Cumulative Return Maximization

Foglino, Christakou, Gutierrez, et al. 2019

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence


Curriculum learning has been successfully used in reinforcement learning to accelerate the learning process, through knowledge transfer between tasks of increasing complexity. Critical tasks, in which suboptimal exploratory actions must be minimized, can benefit from curriculum learning, and its ability to shape exploration through transfer. We propose a task sequencing algorithm maximizing the cumulative return, that is, the return obtained by the agent across all the learning episodes. By maximizing the cumulative return, the agent not only aims at achieving high rewards as fast as possible, but also at doing so while limiting suboptimal actions. We experimentally compare our task sequencing algorithm to several popular metaheuristic algorithms for combinatorial optimization, and show that it achieves significantly better performance on the problem of cumulative return maximization. Furthermore, we validate our algorithm on a critical task, optimizing a home controller for a micro energy grid.
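The difference between the two objectives named in these excerpts can be illustrated with a minimal Python sketch. It assumes the per-episode returns of a training run are available as a plain list; the function names and the sample numbers are hypothetical and are not taken from the cited paper. Cumulative return follows the abstract's definition (the return summed across all learning episodes), while time-to-threshold only records when a target return is first reached.

```python
from typing import Optional, Sequence


def cumulative_return(episode_returns: Sequence[float]) -> float:
    """Sum of the returns obtained across all learning episodes."""
    return float(sum(episode_returns))


def time_to_threshold(
    episode_returns: Sequence[float], threshold: float
) -> Optional[int]:
    """1-based index of the first episode whose return reaches the threshold.

    Returns None if the threshold is never reached.
    """
    for episode, ret in enumerate(episode_returns, start=1):
        if ret >= threshold:
            return episode
    return None


# Hypothetical runs (illustrative numbers only).
cautious = [-2.0, -1.0, 0.0, 5.0, 5.0]    # few costly exploratory actions
reckless = [-20.0, -15.0, 5.0, 5.0, 5.0]  # reaches the threshold sooner

for name, run in (("cautious", cautious), ("reckless", reckless)):
    print(name, time_to_threshold(run, 5.0), cumulative_return(run))
```

In this toy data the second run reaches the threshold one episode earlier but accumulates a much lower total return, which is the kind of trade-off that matters for a critical task where suboptimal exploratory actions are costly.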



“…Transfer learning is leveraged to transfer information between each pair of tasks in this sequence. Our work builds upon the model proposed by Narvekar et al [16], which formulates curriculum generation as an interaction between two agents acting in two different MDPs. One is a learning agent that is trying to solve a specific target task MDP M_t, as is the standard case in reinforcement learning.…”

Section: Curriculum Learning (mentioning)

confidence: 99%

“…a curriculum) for an agent to train on, such that after training on that sequence, learning speed or performance on a target task is improved. Automatically designing a curriculum is an open problem that has only recently begun to be examined [5,8,9,16,19,23]. One recent approach [16] proposed formulating the selection of tasks using a (meta-level) curriculum Markov Decision Process (MDP).…”

Section: Introduction (mentioning)

confidence: 99%

“…Automatically designing a curriculum is an open problem that has only recently begun to be examined [5,8,9,16,19,23]. One recent approach [16] proposed formulating the selection of tasks using a (meta-level) curriculum Markov Decision Process (MDP). A policy over this MDP, called a curriculum policy, maps from the current knowledge of an RL agent to the task it should learn next.…”

Section: Introduction (mentioning)

confidence: 99%
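The idea of a curriculum policy described in these excerpts, a mapping from the agent's current knowledge to the task it should learn next, can be sketched as a small loop. This is a toy illustration under loud assumptions, not the cited authors' implementation: the agent's knowledge is represented by the set of task names it has already trained on, and `toy_policy`, `toy_train`, and the task names are all hypothetical.

```python
from typing import Callable, FrozenSet, List, Optional

# Assumption: a set of completed task names stands in for the agent's knowledge.
Knowledge = FrozenSet[str]
CurriculumPolicy = Callable[[Knowledge], Optional[str]]
TrainFn = Callable[[Knowledge, str], Knowledge]


def run_curriculum(
    policy: CurriculumPolicy,
    train: TrainFn,
    start: Knowledge = frozenset(),
    max_steps: int = 10,
) -> List[str]:
    """Repeatedly ask the curriculum policy for the next task and train on it."""
    knowledge = start
    curriculum: List[str] = []
    for _ in range(max_steps):
        task = policy(knowledge)  # meta-level action: choose the next task
        if task is None:          # the policy signals that the curriculum is done
            break
        knowledge = train(knowledge, task)  # the learning agent updates its knowledge
        curriculum.append(task)
    return curriculum


def toy_train(knowledge: Knowledge, task: str) -> Knowledge:
    """Hypothetical learner: training on a task simply records it as learned."""
    return knowledge | {task}


def toy_policy(knowledge: Knowledge) -> Optional[str]:
    """Hypothetical fixed-order policy: pick the first task not yet learned."""
    for task in ("small_grid", "grid_with_pits", "target"):
        if task not in knowledge:
            return task
    return None


print(run_curriculum(toy_policy, toy_train))
```

Because the policy is queried again after every training step, the sequence it produces depends on what the agent has learned so far rather than being fixed in advance.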


Curriculum Learning in Reinforcement Learning

Narvekar, 2017

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence

Self Cite


Transfer learning in reinforcement learning is an area of research that seeks to speed up or improve learning of a complex target task, by leveraging knowledge from one or more source tasks. This thesis will extend the concept of transfer learning to curriculum learning, where the goal is to design a sequence of source tasks for an agent to train on, such that final performance or learning speed is improved. We discuss completed work on this topic, including methods for semi-automatically generating source tasks tailored to an agent and the characteristics of a target domain, and automatically sequencing such tasks into a curriculum. Finally, we also present ideas for future work.


Joint Situational Assessment‐Hierarchical Decision‐Making Framework for Maneuver Intent Decisions

Chen, Li, Yan, et al. 2024

Advanced Intelligent Systems


Decision‐making in unmanned combat aerial vehicles (UCAVs) presents a multifaceted challenge because of the complexity and dynamics of the flight environment, which leads to hurdles in training convergence, low decision validity, and the dimensionality catastrophe for decision‐making neural networks. A novel framework is proposed to address breaking down the complicated decision issues, which combines the strengths of graph convolutional networks in relation extraction with the ability of hierarchical reinforcement learning. To solve the problem of decision validity under high‐dimensional inputs, the joint framework is applied to the Maneuver Intent's decision, and a maneuver library‐based state space design method is suggested. The joint framework executes adaptable strategies and flight maneuvers to address the issue of training non‐convergence or task failure due to difficult‐to‐obtain reward signals across various scenarios. Then, the recurrent curriculum training and cross‐entropy rewards are designed to train decisions on different sub‐strategies. The experimental evaluation demonstrated more flexibility and adaptability in decision‐making problems under complex tasks compared to rule‐based and reinforcement learning baseline methods. The method proposed in this article provides a novel approach to resolving intricate decision problems, and which has certain theoretical significance and reference value for engineering applications.
