2021
DOI: 10.1109/tcds.2019.2933371
CLIC: Curriculum Learning and Imitation for Object Control in Nonrewarding Environments

Abstract: In this paper we study a new reinforcement learning setting where the environment is non-rewarding, contains several possibly related objects of various controllability, and where an apt agent Bob acts independently, with non-observable intentions. We argue that this setting defines a realistic scenario and we present a generic discrete-state discrete-action model of such environments. To learn in this environment, we propose an unsupervised reinforcement learning agent called CLIC for Curriculum Learning and …

Cited by 14 publications (14 citation statements)

References 35 publications (37 reference statements)
“…DisTop simultaneously learns skills, their goal representation, and which skill to train on. It contrasts with several methods that focus exclusively on selecting which skill to train on, assuming a good goal representation is available [23,17,24,62,18]. They either select goals according to a curriculum defined by intermediate difficulty and learning progress [47] or imagine new language-based goals [18].…”
Section: Related Work
confidence: 99%
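The curriculum mechanism mentioned above — selecting which goal to train on from learning progress — can be illustrated with a minimal sketch. The class name `LPGoalSampler`, the window size, and the exploration floor are all hypothetical choices, not part of any cited method: learning progress is estimated as the change in success rate between two recent windows, and goals are sampled in proportion to its magnitude.

```python
import random
from collections import deque

class LPGoalSampler:
    """Hypothetical sketch: sample goals in proportion to absolute
    learning progress (LP), i.e. the recent change in success rate."""

    def __init__(self, goals, window=10, eps=0.1):
        self.goals = list(goals)
        self.window = window
        self.eps = eps  # exploration floor so stalled goals are revisited
        self.history = {g: deque(maxlen=2 * window) for g in self.goals}

    def learning_progress(self, goal):
        h = self.history[goal]
        if len(h) < 2 * self.window:
            return 1.0  # optimistic init: barely tried goals look promising
        old = sum(list(h)[: self.window]) / self.window
        new = sum(list(h)[self.window:]) / self.window
        return abs(new - old)

    def sample(self):
        if random.random() < self.eps:
            return random.choice(self.goals)
        lps = [self.learning_progress(g) for g in self.goals]
        if sum(lps) == 0:
            return random.choice(self.goals)  # all goals stalled
        return random.choices(self.goals, weights=lps, k=1)[0]

    def update(self, goal, success):
        self.history[goal].append(float(success))
```

The absolute value makes both improvement and forgetting attractive, which is a common design choice in learning-progress curricula; a goal whose success rate is flat (mastered or hopeless) receives little training time.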
“…Such techniques are called active imitation learning or interactive learning, and echo psychological descriptions of infants' selectivity toward social partners and its link to their motivation to learn [40,41]. Active imitation learning has been implemented [42] with an agent that learns when to imitate using intrinsic motivation, for a hierarchical RL problem in a discrete setting. For continuous action, state, and goal spaces, the SGIM-ACTS algorithm [38] uses intrinsic motivation to choose not only the kind of demonstrations, but also when to request demonstrations and whom to ask among several teachers.…”
Section: Active Imitation Learning (Social Guidance)
confidence: 99%
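The "when to imitate" decision described in this statement can be sketched in a few lines. This is not the cited implementation — the class name, the stall threshold, and the optimistic default are illustrative assumptions — but it captures the intrinsic-motivation logic: request a demonstration only once autonomous practice on a goal stops yielding progress.

```python
class ActiveImitationPolicy:
    """Hypothetical sketch of 'when to imitate': request a demonstration
    only when autonomous progress on the current goal has stalled."""

    def __init__(self, stall_threshold=0.05):
        self.stall_threshold = stall_threshold
        self.progress = {}  # goal -> latest learning-progress estimate

    def record_progress(self, goal, lp):
        self.progress[goal] = lp

    def should_request_demo(self, goal):
        # Optimistic default (1.0): keep practising goals never measured.
        # Imitate once self-practice no longer improves competence.
        return self.progress.get(goal, 1.0) < self.stall_threshold
```

A fuller version in the spirit of SGIM-ACTS would also compare teachers and demonstration kinds by their empirically measured progress, treating "ask teacher k" as one more strategy competing with autonomous exploration.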
“…Most such autotelic agents are equipped with one or several goal spaces and rely on goal-conditioned RL (Colas et al, 2020b) and automatic curriculum learning (Portelas et al, 2020) to learn to achieve those goals along an open-ended developmental trajectory. This endows them with the capability to decide which goals to target and learn about as a function of their current abilities (Florensa et al, 2018; Fournier et al, 2019; Colas et al, 2019; Racaniere et al, 2019). Thus, by contrast with Interactive RL agents, autotelic agents offer a promising solution to the boundedness issue: if they explore an unbounded set of goals of increasing complexity, they may end up accounting for the open-ended development of children.…”
Section: Autonomous Reinforcement Learners
confidence: 99%
“…Closer to our concerns, the autotelic CLIC agent imitates the behavior of other agents acting in the environment without a pedagogical stance (Fournier et al, 2019). An interesting feature of CLIC is that it relies on a curriculum learning mechanism to decide which goal to imitate from these agents, depending on its current capabilities.…”
Section: Observational Learning
confidence: 99%
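The idea of a curriculum over *which observed goal to imitate* can be sketched as a competence-gated filter. The function below is a hypothetical illustration (the name, the competence bounds, and the tie-breaking rule are assumptions, not CLIC's actual mechanism): among goals an expert like Bob was observed achieving, prefer those at intermediate competence — neither already mastered nor currently out of reach.

```python
def choose_goal_to_imitate(competence, observed_goals, low=0.2, high=0.8):
    """Hypothetical sketch: among goals an expert was observed achieving,
    prefer those at intermediate competence, mirroring a curriculum over
    what to imitate. `competence` maps goal -> success rate in [0, 1]."""
    candidates = [g for g in observed_goals
                  if low <= competence.get(g, 0.0) <= high]
    if not candidates:
        # Nothing at intermediate difficulty: fall back to the
        # least-mastered observed goal.
        return min(observed_goals, key=lambda g: competence.get(g, 0.0))
    # Among intermediate goals, pick the least mastered for most headroom.
    return min(candidates, key=lambda g: competence[g])
```

Gating imitation on the learner's own competence is what makes this a curriculum rather than blind mimicry: the same observed behavior is ignored early (too hard) and late (already mastered), and imitated only in between.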