“…Narasimhan et al (2015) demonstrate that Deep Q-Networks (DQN) (Mnih et al, 2015) […] et al (2020) show that the Go-Explore algorithm (Ecoffet et al, 2019), which periodically returns to promising but under-explored areas of a world, can achieve higher scores than the DRRN in fewer steps.

Agent performance across six benchmark games ("-" marks scores not reported):

(Fulda et al, 2017a)                            0.59  0.03  0.00  0.10  0.00  0.01
Golovin (Kostka et al, 2017)                    0.20  0.04  0.10  0.15  0.00  0.01
AE-DQN (Zahavy et al, 2018)                     -     0.05  -     -     -     -
NeuroAgent (Rajalingam and Samothrakis, 2019)   0.19  0.03  0.00  0.20  0.00  0.00
NAIL (Hausknecht et al, 2019)                   0.38  0.03  0.26  -     0.00  0.00
CNN-DQN (Yin and May, 2019a)                    -     0.11  -     -     -     -
IK-OMP (Tessler et al, 2019)                    -     1.00  -     -     -     -
TDQN                                            0.47  0.03  0.00  0.34  0.02  0.00
KG-A2C                                          0.58  0.10  0.01  0.06  0.03  0.01
SC (Jain et al, 2020)                           -     0.10  -     -     0.00  -
CALM (N-gram) (Yao et al, 2020)                 0.79  0.07  0.00  0.09  0.00  0.00
CALM (GPT-2) (Yao et al, 2020)                  0.80  0.09  0.07  0.14  0.05  0.01
RC-DQN (Guo et al, 2020a)                       0.81  0.11  0.40  0.20  0.05  0.02
MPRC-DQN (Guo et al, 2020a)                     0.88  0.11  0.52  0.20  0.05  0.02
SHA-KG (Xu et al, 2020)                         0.86  0.10  0.10  -     0.05  0.02
MC!Q*BERT (Ammanabrolu et al, 2020b)            0.92  0.12  -     -     0.00  -
INV-DY (Yao et al, 2021)                        0.81  0.12  0.06  0.11  0.05  -

To support these modeling paradigms, Zelinka et al (2019) introduce TextWorld KG, a dataset for learning the subtask of updating knowledge graphs from textual world descriptions in a cooking domain, and show that their best ensemble model achieves 70 F1 on this subtask. Similarly, Ammanabrolu et al (2021a) introduce JerichoWorld, a comparable dataset for knowledge-graph world modeling that spans a broader set of interactive fiction games, and subsequently introduce Worldformer (Ammanabrolu and Riedl, 2021b), a multi-task transformer model that performs well at both knowledge-graph prediction and next-action prediction.…”
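To make the knowledge-graph world-modeling subtask concrete, the sketch below is a minimal illustration (not the TextWorld KG or JerichoWorld reference code): world state is held as a set of (subject, relation, object) triples, an update is applied after an action, and a predicted graph is scored against a gold graph with the exact-match triple F1 under which results like the 70 F1 above are typically reported. All entity and relation names here are invented for the example.

```python
# Hypothetical sketch of knowledge-graph world modeling: state as a set of
# (subject, relation, object) triples, updated per observation, and evaluated
# against a gold graph with set-level precision/recall/F1 over exact triples.

def update_graph(triples, additions, deletions):
    """Apply one world-state update: retract stale facts, then add new ones."""
    return (set(triples) - set(deletions)) | set(additions)

def graph_f1(predicted, gold):
    """F1 over exact-match triples between a predicted and a gold graph."""
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)          # triples predicted correctly
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)

# Example observation: the agent walks from the kitchen into the garden.
state = {("player", "at", "kitchen"), ("knife", "in", "kitchen")}
state = update_graph(
    state,
    additions={("player", "at", "garden"), ("apple", "in", "garden")},
    deletions={("player", "at", "kitchen")},
)

# Gold graph contains one triple the model missed ("shed" east of "garden").
gold = {("player", "at", "garden"), ("apple", "in", "garden"),
        ("knife", "in", "kitchen"), ("shed", "east_of", "garden")}
print(round(graph_f1(state, gold), 2))  # → 0.86 (precision 1.0, recall 0.75)
```

Representing state as a triple set makes both dataset subtasks above expressible as set operations: graph update is add/delete edits, and graph prediction quality is overlap with the gold set.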