Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) 2020
DOI: 10.18653/v1/2020.emnlp-main.241
Bootstrapped Q-learning with Context Relevant Observation Pruning to Generalize in Text-based Games

Abstract: We show that Reinforcement Learning (RL) methods for solving Text-Based Games (TBGs) often fail to generalize on unseen games, especially in small data regimes. To address this issue, we propose Context Relevant Episodic State Truncation (CREST) for irrelevant token removal in observation text for improved generalization. Our method first trains a base model using Q-learning, which typically overfits the training games. The base model's action token distribution is used to perform observation pruning that remo…
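The pruning idea in the abstract can be illustrated with a minimal sketch. This is not the paper's actual CREST procedure (which derives token relevance from a trained base model); the `prune_observation` helper and the toy `action_counts` table below are hypothetical, standing in for the base model's action token distribution. Tokens of the observation that never appear among the base model's action tokens are treated as context-irrelevant and dropped.

```python
from collections import Counter


def prune_observation(observation: str, action_counts: Counter, min_count: int = 1) -> str:
    """Keep only observation tokens that occur at least min_count times
    in the (hypothetical) action-token counts; drop everything else as
    context-irrelevant. Counter returns 0 for unseen tokens."""
    kept = [tok for tok in observation.split()
            if action_counts[tok.lower()] >= min_count]
    return " ".join(kept)


# Toy stand-in for the base policy's action token distribution.
action_counts = Counter({"take": 5, "key": 3, "open": 4, "door": 4, "go": 6, "north": 2})

obs = "You see a rusty key on the dusty table next to a wooden door"
print(prune_observation(obs, action_counts))  # → "key door"
```

Only "key" and "door" survive, since they are the only observation tokens that also occur in the action-token table; the pruned observation is what a downstream policy would then consume.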

Cited by 9 publications (10 citation statements) · References 18 publications
“…LeDeepChef (Adolphs and Hofmann, 2020) used recurrent feature extraction along with A2C (Mnih et al., 2016). CREST (Chaudhury et al., 2020) was proposed for pruning observation information. TWC (Murugesan et al., 2021) was proposed for utilizing common sense reasoning.…”
Section: Related Work
confidence: 99%
“…The field of text-based and interactive games has seen a lot of recent interest and work, thanks in large part to the creation and availability of pioneering environments such as TextWorld [11] and the Jericho [17] collection. Based on these domains, several interesting approaches have been proposed that seek to improve the efficiency of agents in these environments [4,12,9,22]. We mention and discuss this prior work in context in the earlier parts of this paper.…”
Section: Related Work
confidence: 99%
“…Specifically, we consider RL agents in the TextWorld and Jericho TBG environments, and the additional information that can be provided to such agents to improve their performance. Past work has focused on using external knowledge to either limit [9] or enhance [22] the space of actions; however, this has also been restricted to the text modality. At their crux, these efforts all fundamentally try to solve the problem of relationships within the environment: how are different things in the world related to each other?…”
Section: Introduction
confidence: 99%
“…Under certain controls necessary for studying RL, text-based games provide complex, interactive, and varied simulated environments where the game state observation is obtained through a text description, and the agent is expected to make progress by entering text commands. In addition to language understanding (Ammanabrolu and Riedl, 2019; Adhikari et al., 2020), successful play requires skills such as long-term memory (Narasimhan et al., 2015), exploration, observation pruning (Chaudhury et al., 2020), and common sense reasoning (Keerthiram Murugesan and Campbell, 2021). However, these studies do not use a neuro-symbolic approach, which combines a neural network with a symbolic framework.…”
Section: Introduction
confidence: 99%