2020
DOI: 10.48550/arxiv.2012.01244
Preprint

General Characterization of Agents by States they Visit

Cited by 1 publication (11 citation statements)
References 0 publications
“…To compare these policy abstractions quantitatively, we demonstrate how the distances between two policies, measured by the corresponding policy metrics, differ in several Gridworld MDPs. We borrow Distinct Policies and Doorway from [14] and design a new environment, Key Action, as simple prototypes of environments with different features; moreover, we increase the stochasticity of the environment for a better evaluation, as done in [14]. In particular, E_{s∼p(s)}[D(·,·)] is calculated by averaging the absolute differences over all states.…”
Section: Empirical Comparison of Policy Metrics in Gridworld MDPs (mentioning)
confidence: 99%
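The averaging step quoted above is simple to reproduce. Below is a minimal sketch assuming tabular policies stored as state-by-action probability matrices; the function name expected_policy_distance and the choice of per-state distance (mean absolute difference of action probabilities) are illustrative assumptions, not taken from the cited paper.

```python
import numpy as np

def expected_policy_distance(pi1, pi2, state_probs):
    """Estimate E_{s ~ p(s)}[D(pi1, pi2)] for tabular policies.

    pi1, pi2: arrays of shape (n_states, n_actions), rows are action
    probabilities. state_probs: shape (n_states,), the distribution p(s).
    D is taken here as the mean absolute difference of action
    probabilities per state (an assumption, not the paper's exact D).
    """
    per_state = np.abs(pi1 - pi2).mean(axis=1)    # D(pi1(s), pi2(s)) for each s
    return float(np.dot(state_probs, per_state))  # weight by p(s) and sum

# Example: two policies over 3 states and 2 actions, uniform p(s).
pi_a = np.array([[1.0, 0.0], [0.5, 0.5], [0.2, 0.8]])
pi_b = np.array([[0.0, 1.0], [0.5, 0.5], [0.2, 0.8]])
p_s = np.full(3, 1 / 3)
print(expected_policy_distance(pi_a, pi_b, p_s))  # 0.333...: one of three states fully differs
```

With a uniform p(s), this reduces to the plain average of the per-state absolute differences over all states, which matches the computation described in the quote.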
“…3.2 in policy optimization below. To be specific, we consider two policy optimization problem settings: Trust-Region Policy Optimization (TRPO) and Diversity-Guided Evolutionary Strategy (DGES), as introduced in [14], covering both gradient-based and gradient-free policy optimization. Complete details of the problem settings are provided in Appendix E.…”
Section: Applying Policy Abstraction to Policy Optimization (mentioning)
confidence: 99%
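To make the gradient-free setting concrete, here is a minimal NES-style sketch of a diversity-guided evolutionary step, assuming the policy metric enters as an additive novelty bonus against an archive of past policies. The actual DGES formulation in [14] may differ; fitness_fn, policy_dist_fn, and archive are hypothetical placeholders.

```python
import numpy as np

def dges_step(theta, fitness_fn, policy_dist_fn, archive,
              sigma=0.1, alpha=0.01, pop_size=50, beta=0.5, rng=None):
    """One diversity-guided evolution-strategy update (illustrative only).

    Each perturbed parameter vector is scored by task fitness plus a
    diversity bonus: its mean policy distance to an archive of earlier
    policies, measured with the supplied policy metric policy_dist_fn.
    """
    rng = np.random.default_rng() if rng is None else rng
    eps = rng.standard_normal((pop_size, theta.size))
    scores = np.empty(pop_size)
    for i in range(pop_size):
        cand = theta + sigma * eps[i]
        novelty = (np.mean([policy_dist_fn(cand, old) for old in archive])
                   if archive else 0.0)
        scores[i] = fitness_fn(cand) + beta * novelty
    # Standardize scores, then take the usual ES gradient estimate.
    scores = (scores - scores.mean()) / (scores.std() + 1e-8)
    grad = eps.T @ scores / (pop_size * sigma)
    return theta + alpha * grad
```

The design point this illustrates: a policy metric gives the optimizer a behavior-level notion of novelty, so the diversity pressure acts on what policies do rather than on raw parameter distance.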