2020
DOI: 10.48550/arxiv.2002.03072
Preprint

Generalized Hidden Parameter MDPs: Transferable Model-based RL in a Handful of Trials

Abstract: There is broad interest in creating RL agents that can solve many (related) tasks and adapt to new tasks and environments after initial training. Model-based RL leverages learned surrogate models that describe dynamics and rewards of individual tasks, such that planning in a good surrogate can lead to good control of the true system. Rather than solving each task individually from scratch, hierarchical models can exploit the fact that tasks are often related by (unobserved) causal factors of variation in order…

Citations: cited by 1 publication (1 citation statement)
References: 4 publications
“…This has a very practical benefit as it limits the length of applicable interaction history. Current work also typically assumes a stationary task state distribution for meta-learning (Doshi-Velez and Konidaris, 2013; Wang et al., 2016; Zintgraf et al., 2018; Rakelly et al., 2019; Zintgraf et al., 2019; Humplik et al., 2019; Fakoor et al., 2019; Perez et al., 2020). However, this framework has also been readily applicable to more challenging multi-agent learning settings (Da Silva et al., 2006; Amato et al., 2013; Marinescu et al., 2017; Vezhnevets et al., 2019).…”
Section: Context Detection (citation type: mentioning)
confidence: 99%