2021
DOI: 10.48550/arxiv.2102.02926
Preprint

Alchemy: A benchmark and analysis toolkit for meta-reinforcement learning agents

Cited by 5 publications (6 citation statements)
References 0 publications
“…For example, Tosch et al [83] present three highly parametrisable versions of Atari games, and use them to perform post hoc analysis of agents trained on a single variant. Some environments are not targeted at zero-shot policy transfer (CausalWorld, RWRL, RLBench, Alchemy, Meta-world [51,81,78,45,68]), but could be adapted to such a scenario with a different evaluation protocol. More generally, all environments provide a context set, and many then propose specific evaluation protocols, but other protocols could be used as long as they were well justified.…”
Section: Discussion
confidence: 99%
“…Protocol C is commonly used among PCG environments that are not explicitly targeted at generalisation (MiniGrid, NLE, MiniHack, Alchemy [70,73,71,45]). The testing context set consists of seeds held out from the training set; otherwise, during training, the full context set is used.…”
Section: Evaluation Protocols For Generalisation
confidence: 99%
“…However, a more robust benchmark could include the aforementioned change points in order to further control the complexity. The CT-graph, Meta-world, and the recently developed Alchemy (Wang et al, 2021) environment are examples of benchmarks with early-stage work in this direction, albeit implicitly. Therefore, the development of a precise measure of task similarity and complexity, as well as robust benchmarks with configurable change points (i.e., reward, state/input, and transition), would be highly beneficial to the meta-RL field.…”
Section: Discussion
confidence: 99%
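A benchmark with a configurable change point, as the statement above calls for, can be illustrated with a minimal toy environment. This is a sketch under stated assumptions, not Alchemy's API: after a configurable number of steps, the reward function switches, so a meta-RL agent must detect the change and adapt.

```python
class ChangePointEnv:
    """Toy environment with a configurable reward change point.

    Before step `change_step`, action +1 is rewarded; afterwards,
    action -1 is. All names here are hypothetical illustrations.
    """

    def __init__(self, change_step=50, episode_len=100):
        self.change_step = change_step
        self.episode_len = episode_len
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0  # dummy observation

    def step(self, action):
        self.t += 1
        # Reward change point: the rewarded action flips sign.
        target = 1 if self.t <= self.change_step else -1
        reward = 1.0 if action == target else 0.0
        done = self.t >= self.episode_len
        return 0.0, reward, done

env = ChangePointEnv(change_step=50)
env.reset()
```

Analogous state/input or transition change points could be added by swapping the observation encoding or the dynamics at `change_step`, giving the configurable complexity control the statement argues for.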
“…Popular crafting games, such as Minecraft and Little Alchemy, have inspired research on autonomous exploration in people (Brändle et al, 2023) and artificial agents (G. Wang et al, 2023). Crafting games are also widely used for designing benchmarks for human-like generalization and reasoning (Hafner, 2022;J. X. Wang et al, 2021).…”
Section: Crafting Games
confidence: 99%