2022
DOI: 10.1613/jair.1.13554
Autotelic Agents with Intrinsically Motivated Goal-Conditioned Reinforcement Learning: A Short Survey

Abstract: Building autonomous machines that can explore open-ended environments, discover possible interactions and build repertoires of skills is a general objective of artificial intelligence. Developmental approaches argue that this can only be achieved by autotelic agents: intrinsically motivated learning agents that can learn to represent, generate, select and solve their own problems. In recent years, the convergence of developmental approaches with deep reinforcement learning (RL) methods has been leading to the …

Cited by 55 publications (87 citation statements)
References 68 publications (93 reference statements)
“…control instructions and act toward various goals (Veeriah et al., 2018). Based on universal value function approximators (UVFAs) (Schaul et al., 2015), goal-conditioned RL (GCRL) (Colas et al., 2022) was proposed to accomplish these tasks by leveraging goal-conditioned value and policy networks. The RL agent is optimized on goal-labeled trajectories with goal-specific rewards.…”
Section: Figure (mentioning, confidence: 99%)
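The excerpt above describes the UVFA-style architecture behind GCRL: the goal is fed to the value and policy networks as an extra input, so a single network generalizes across goals. Below is a minimal sketch of a goal-conditioned Q-network in PyTorch; the layer sizes, the plain concatenation of state, action, and goal, and the class name are illustrative assumptions, not the specific architectures of Schaul et al. (2015) or Colas et al. (2022).

```python
import torch
import torch.nn as nn

class GoalConditionedQNetwork(nn.Module):
    """UVFA-style Q(s, a, g): conditioning on the goal g lets one
    network estimate values for arbitrary goals, instead of training
    a separate value function per goal."""

    def __init__(self, state_dim: int, action_dim: int, goal_dim: int,
                 hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim + goal_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),  # scalar Q-value estimate
        )

    def forward(self, state: torch.Tensor, action: torch.Tensor,
                goal: torch.Tensor) -> torch.Tensor:
        # Simple concatenation as the goal encoding; richer goal
        # encoders are possible but not assumed here.
        x = torch.cat([state, action, goal], dim=-1)
        return self.net(x)
```

Training would then proceed as the excerpt says: temporal-difference targets are computed from goal-labeled trajectories, with the reward relabeled per goal so the same transition can supervise many goals.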
“…In reinforcement learning (RL), the reward function specifies the external, environmentally defined signal the agent attempts to maximize: formally, it maps each state-action pair to the scalar reward the agent receives for taking that action in that state. Agent-generated goals are used to supplement the environment's reward function and provide an alternative signal that can guide exploration (see Colas et al., 2020b; we omit discussion of other exploration approaches, see Weng, 2020, for a review). Most current approaches generate goals that can be evaluated on only a single world state, and as such fall far short of the richness of real-world goals (Colas et al., 2020b, section 7.1).…”
Section: Related Work (mentioning, confidence: 99%)
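The excerpt's point that most generated goals "can be evaluated on only a single world state" has a common concrete instance: a sparse goal-achievement reward that checks one state against the goal and ignores all history. The sketch below illustrates this limitation; the identity goal mapping and the tolerance value are assumptions for illustration, not a method from the cited papers.

```python
import numpy as np

def sparse_goal_reward(state: np.ndarray, goal: np.ndarray,
                       tol: float = 0.05) -> float:
    """Return 1.0 iff the goal is achieved in the current state.

    The check inspects a single state only: no trajectory history,
    ordering, or temporal structure can influence the reward, which
    is exactly the single-state-evaluation limitation noted above.
    """
    # Illustrative assumption: goal space equals state space, so the
    # "achieved goal" is the state itself.
    achieved_goal = state
    return 1.0 if np.linalg.norm(achieved_goal - goal) <= tol else 0.0
```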
“…In the larger context, acquisition is an important part of simulating the development of intelligent behavior as well. So-called autotelic agents serve as a meta-approach to acquisition by modeling open-ended adaptive behaviors [19]. Developmental connectionist models capture developmental trajectories and transitions, critical periods, and the process of learning [20,21].…”
Section: Simulation Of Interconnected Developmental Systems (mentioning, confidence: 99%)