2019
DOI: 10.1609/aaai.v33i01.33013582

Combined Reinforcement Learning via Abstract Representations

Abstract: In the quest for efficient and robust reinforcement learning methods, both model-free and model-based approaches offer advantages. In this paper we propose a new way of explicitly bridging both approaches via a shared low-dimensional learned encoding of the environment, meant to capture summarizing abstractions. We show that the modularity brought by this approach leads to good generalization while being computationally efficient, with planning happening in a smaller latent state space. In addition, this appro…
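As a rough illustration of the approach summarized above, the sketch below shows how a single low-dimensional encoder can feed both a model-free Q head and a learned latent transition/reward model, with a short depth-limited rollout performed entirely in the abstract state space. This is a minimal sketch under assumed module names, layer sizes, and residual latent dynamics, not the authors' implementation.

```python
# Minimal CRAR-style sketch (illustrative, not the paper's code): a shared
# encoder maps observations to a low-dimensional abstract state that serves
# both a model-free Q head and latent transition/reward models, so planning
# can be done by rolling the model forward in the abstract state space.
import torch
import torch.nn as nn


class CRARSketch(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int, abstract_dim: int = 3):
        super().__init__()
        # Shared encoder: observation -> low-dimensional abstract state
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, abstract_dim)
        )
        # Model-free head: Q-values over actions, read off the abstract state
        self.q_head = nn.Sequential(
            nn.Linear(abstract_dim, 64), nn.Tanh(), nn.Linear(64, n_actions)
        )
        # Model-based heads: latent transition and reward models
        self.transition = nn.Sequential(
            nn.Linear(abstract_dim + n_actions, 64), nn.Tanh(),
            nn.Linear(64, abstract_dim),
        )
        self.reward = nn.Sequential(
            nn.Linear(abstract_dim + n_actions, 64), nn.Tanh(), nn.Linear(64, 1)
        )
        self.n_actions = n_actions

    def plan(self, obs: torch.Tensor, depth: int = 2, gamma: float = 0.95):
        """Score each first action by a depth-limited exhaustive rollout in
        the latent space: predicted reward plus discounted backed-up value."""
        s = self.encoder(obs)
        return torch.stack(
            [self._rollout(s, a, depth, gamma) for a in range(self.n_actions)]
        )

    def _rollout(self, s, action, depth, gamma):
        a = nn.functional.one_hot(torch.tensor(action), self.n_actions).float()
        sa = torch.cat([s, a], dim=-1)
        r = self.reward(sa).squeeze(-1)
        s_next = s + self.transition(sa)  # residual latent dynamics (assumption)
        if depth == 1:
            # Bootstrap the leaf with the model-free Q estimate
            return r + gamma * self.q_head(s_next).max(dim=-1).values
        backed_up = torch.stack(
            [self._rollout(s_next, b, depth - 1, gamma) for b in range(self.n_actions)]
        ).max(dim=0).values
        return r + gamma * backed_up


if __name__ == "__main__":
    agent = CRARSketch(obs_dim=8, n_actions=4)
    scores = agent.plan(torch.randn(8), depth=2)
    print("planning scores per action:", scores)
```

Because the encoder is shared, improvements to either the value function or the latent model shape the same abstract state, which is what keeps planning cheap in the small latent space.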

Cited by 53 publications (63 citation statements) · References 2 publications
“…The CRAR agent explicitly learns both a value function and a model via a shared low-dimensional learned encoding of the environment, which is meant to capture summarized abstractions and allow for efficient planning (François-Lavet et al., 2018). By forcing an expressive representation, the CRAR approach creates an interpretable low-dimensional representation of the environment, even far temporally from any rewards or in the absence of model-free objectives.…”
Section: Value-based RL (mentioning)
confidence: 99%
“…In deep RL, it is possible to build an abstract state such that it provides sufficient information for simultaneously fitting an internal meaningful dynamics as well as the estimation of the expected value of an optimal policy. By explicitly learning both the model-free and model-based components through the state representation, along with an approximate entropy maximization penalty, the CRAR agent (François-Lavet et al., 2018) shows how it is possible to learn a low-dimensional representation of the task. In addition, this approach can directly make use of a combination of model-free and model-based methods, with planning happening in a smaller latent state space.…”
Section: Auxiliary Tasks (mentioning)
confidence: 99%
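To make the combination of losses mentioned in the statement above concrete, here is a hedged sketch of how a model-free TD loss, latent transition and reward losses, and an approximate entropy-maximization penalty on the abstract states could be summed into one training objective. The penalty form (pushing apart the encodings of randomly paired observations) and the loss weights are illustrative assumptions, not the paper's exact terms.

```python
# Hedged sketch of a combined model-free + model-based objective with an
# approximate entropy-maximization penalty on the abstract representation.
# The penalty form and loss weights below are assumptions for illustration.
import torch


def representation_expansion_penalty(z1: torch.Tensor, z2: torch.Tensor,
                                     c: float = 5.0) -> torch.Tensor:
    """Pushes abstract states of randomly paired observations apart so the
    encoding stays expressive even far from rewards (approximate entropy
    maximization); the exponential form used here is an assumption."""
    return torch.exp(-c * torch.norm(z1 - z2, dim=-1)).mean()


def combined_loss(q_loss: torch.Tensor, transition_loss: torch.Tensor,
                  reward_loss: torch.Tensor, z_batch: torch.Tensor,
                  weights=(1.0, 1.0, 1.0, 0.1)) -> torch.Tensor:
    """Total objective = model-free TD loss + latent transition loss
    + reward-model loss + entropy-style penalty on the encodings z_batch."""
    # Pair each encoding with a shuffled partner from the same batch.
    z_other = z_batch[torch.randperm(z_batch.shape[0])]
    penalty = representation_expansion_penalty(z_batch, z_other)
    w_q, w_t, w_r, w_h = weights
    return w_q * q_loss + w_t * transition_loss + w_r * reward_loss + w_h * penalty


if __name__ == "__main__":
    z = torch.randn(32, 3)  # a batch of 3-dimensional abstract states
    total = combined_loss(torch.tensor(0.5), torch.tensor(0.2),
                          torch.tensor(0.1), z)
    print("combined training loss:", total)
```

The point of the penalty is the one made in the quoted statement: without it, nothing forces the encoder to keep states distinguishable in regions where neither rewards nor model-free targets provide a learning signal.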
“…The recently introduced Consciousness Prior (CP; Bengio, 2017) is a framework to represent the mental model of a single agent, through the notion of abstract state representations. Here, an abstract state corresponds with s (§4), a low-dimensional, structured, interpretable state encoding, useful for planning, communication, and predicting upcoming observations (François-Lavet et al., 2019). One example is a dynamic knowledge graph embedding to represent a scene (Kipf et al., 2020).…”
Section: Formal Framework (mentioning)
confidence: 99%
“…FEWS integrated management requires a combination of economic-based management strategy evaluation with optimization that incorporates environmental impacts and the risk of climate change. Decision theory and reinforcement learning make this integration possible; recent advancements in these fields have shown great promise in modeling complex dynamics of interdependent systems (Littman 2015) in many real-world applications such as human-level control in gaming (Mnih et al. 2015, Silver et al. 2017), natural resource management (Boettiger 2018, 2019), and robotics (Porta et al. 2005, François-Lavet et al. 2018). In this article we develop a dynamic optimization approach based on the fundamentals of decision theory and model-based reinforcement learning, to adaptively control and optimize the operation of integrated FEWS.…”
Section: Introduction (mentioning)
confidence: 99%