Measuring Structural Similarities in Finite MDPs

Wang, Hao; Dong, Shaokang; Shao, Ling

doi:10.24963/ijcai.2019/511

Cited by 15 publications

(6 citation statements)

References 16 publications

“…For MDPs there is much work on measures relating to e.g. homomorphism and abstraction [36,37] and work is starting to emerge to gain more insight in the logical side [31] but their interaction needs study.…”

Section: Discussion (mentioning)

confidence: 99%

“…The Markov property dictates that given the present, the future is independent of the past. To scale to more complex problems, one can exploit structure in the space of state(-action) spaces, or policies or value functions, to utilize abstractions and approximations, for example as value function approximation, state space abstractions [37], and hierarchical decompositions, cf. [36].…”

Section: Introduction (mentioning)

confidence: 99%

Regular Decision Processes for Grid Worlds

Lenaers, Otterlo

2021

Preprint

Markov decision processes are typically used for sequential decision making under uncertainty. For many aspects however, ranging from constrained or safe specifications to various kinds of temporal (non-Markovian) dependencies in task and reward structures, extensions are needed. To that end, in recent years interest has grown into combinations of reinforcement learning and temporal logic, that is, combinations of flexible behavior learning methods with robust verification and guarantees. In this paper we describe an experimental investigation of the recently introduced regular decision processes that support both non-Markovian reward functions as well as transition functions. In particular, we provide a tool chain for regular decision processes, algorithmic extensions relating to online, incremental learning, an empirical evaluation of model-free and model-based solution algorithms, and applications in regular, but non-Markovian, grid worlds.

“…Alternative means to quantify the similarity is to use a full specification of MDPs (Song et al, 2016; Wang et al, 2019) or environmental dynamics Yu et al (2019). In contrast, the proposed MULTI-POLAR allows the knowledge transfer only through the policies acquired from source environment instances, which is beneficial when source and target environments are not always connected to exchange information about their environmental dynamics and training samples.…”

Section: Discussion and Related Work (mentioning)

confidence: 99%

MULTIPOLAR: Multi-Source Policy Aggregation for Transfer Reinforcement Learning between Diverse Environmental Dynamics

Barekatain, Yonetani, Hamaya

2019

Preprint

Transfer reinforcement learning (RL) aims at improving learning efficiency of an agent by exploiting knowledge from other source agents trained on relevant tasks. However, it remains challenging to transfer knowledge between different environmental dynamics without having access to the source environments. In this work, we explore a new challenge in transfer RL, where only a set of source policies collected under unknown diverse dynamics is available for learning a target task efficiently. To address this problem, the proposed approach, MULTI-source POLicy AggRegation (MULTIPOLAR), comprises two key techniques. We learn to aggregate the actions provided by the source policies adaptively to maximize the target task performance. Meanwhile, we learn an auxiliary network that predicts residuals around the aggregated actions, which ensures the target policy's expressiveness even when some of the source policies perform poorly. We demonstrated the effectiveness of MULTIPOLAR through an extensive experimental evaluation across six simulated environments ranging from classic control problems to challenging robotics simulations, under both continuous and discrete action spaces.

“…Also, while the above methods can more effectively leverage task similarity, there are still a number of limitations and open questions. Though notions of neurogenesis, compositionality, and reconfigurability implicitly rely on task similarity, it is not clear whether and how more explicit measures and representations for task similarity could provide further improvements.…”

Section: Reconfigurable Organisms (mentioning)

confidence: 99%

Biological underpinnings for lifelong learning machines

Aguilar-Simon et al.

2022

Biological organisms learn from interactions with their environment throughout their lifetime. For artificial systems to successfully act and adapt in the real world, it is desirable to similarly be able to learn on a continual basis. This challenge is known as lifelong learning, and remains to a large extent unsolved. In this perspective article, we identify a set of key capabilities that artificial systems will need to achieve lifelong learning. We describe a number of biological mechanisms, both neuronal and non-neuronal, that help explain how organisms solve these challenges, and present examples of biologically inspired models and biologically plausible mechanisms that have been applied to artificial intelligence systems in the quest towards development of lifelong learning machines. We discuss opportunities to further our understanding and advance the state of the art in lifelong learning, aiming to bridge the gap between natural and artificial intelligence.
