Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence 2019
DOI: 10.24963/ijcai.2019/511

Measuring Structural Similarities in Finite MDPs

Abstract: In this paper, we investigate the structural similarities within a finite Markov decision process (MDP). We view a finite MDP as a heterogeneous directed bipartite graph and propose novel measures for state similarity and action similarity in a mutual reinforcement manner. We prove that the state similarity is a metric and the action similarity is a pseudometric. We also establish the connection between the proposed similarity measures and the optimal values of the MDP. Extensive experiments show that the prop…
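
As an illustration of the graph view mentioned in the abstract, the following minimal Python sketch stores a toy finite MDP as a directed bipartite graph with state nodes and action nodes. The abstract does not spell out the exact construction, so treating each (state, action) pair as one action node, the function name `mdp_to_bipartite`, and the toy transition table are all assumptions made for illustration. The similarity measures themselves are not implemented here.

```python
from collections import defaultdict


def mdp_to_bipartite(transitions):
    """Split an MDP transition table into the two edge sets of a bipartite graph.

    `transitions` maps (state, action) -> {next_state: (probability, reward)}.
    Assumption: every (state, action) pair becomes its own action node.
    """
    state_to_actions = defaultdict(list)  # edges: state node -> action node
    action_to_states = {}                 # edges: action node -> state node
    for (state, action), outcomes in transitions.items():
        action_node = (state, action)
        state_to_actions[state].append(action_node)
        # Each outgoing edge carries a (probability, reward) label.
        action_to_states[action_node] = dict(outcomes)
    return dict(state_to_actions), action_to_states


# Hypothetical two-state MDP, used only to exercise the function.
toy_mdp = {
    ("s0", "stay"): {"s0": (1.0, 0.0)},
    ("s0", "go"): {"s1": (0.9, 1.0), "s0": (0.1, 0.0)},
    ("s1", "stay"): {"s1": (1.0, 0.0)},
}

state_edges, action_edges = mdp_to_bipartite(toy_mdp)
for state, action_nodes in state_edges.items():
    for node in action_nodes:
        print(state, "->", node, "->", action_edges[node])
```

Edges only ever run from a state node to an action node or from an action node to a state node, which is what makes the graph bipartite. The two node types carry different information (choices versus stochastic outcomes), which is one way to read "heterogeneous".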

Cited by 15 publications (6 citation statements)
References 16 publications
“…For MDPs there is much work on measures relating to e.g. homomorphism and abstraction [36,37] and work is starting to emerge to gain more insight in the logical side [31] but their interaction needs study.…”
Section: Discussion (mentioning)
confidence: 99%
“…The Markov property dictates that given the present, the future is independent of the past. To scale to more complex problems, one can exploit structure in the space of state(-action) spaces, or policies or value functions, to utilize abstractions and approximations, for example as value function approximation, state space abstractions [37], and hierarchical decompositions, cf. [36].…”
Section: Introduction (mentioning)
confidence: 99%
“…Alternative means to quantify the similarity is to use a full specification of MDPs (Song et al, 2016; Wang et al, 2019) or environmental dynamics Yu et al (2019). In contrast, the proposed MULTI-POLAR allows the knowledge transfer only through the policies acquired from source environment instances, which is beneficial when source and target environments are not always connected to exchange information about their environmental dynamics and training samples.…”
Section: Discussion and Related Work (mentioning)
confidence: 99%
“…Also, while the above methods can more effectively leverage task similarity, there are still a number of limitations and open questions. Though notions of neurogenesis, compositionality, and reconfigurability implicitly rely on task similarity, it is not clear whether and how more explicit measures and representations for task similarity could provide further improvements.…”
Section: Reconfigurable Organisms (mentioning)
confidence: 99%