Playing 20 Question Game with Policy-Based Reinforcement Learning

Hu, Huang; Wu, X.; Luo, Bingfeng; Tao, Chongyang; Xu, Can; Wu, Wei; Chen, Zhan

doi:10.18653/v1/d18-1361

Cited by 24 publications

(23 citation statements)

References 8 publications


“…There are several mainstream methods in the DRL framework including Deep Q-Network (Mnih et al 2015) and Policy Networks (Silver et al 2016). Besides, DRL is widely used in many NLP tasks (Wu, Li, and Wang 2018; Feng et al 2018; Li et al 2019; Narasimhan, Kulkarni, and Barzilay 2015; He et al 2015; Hu et al 2018a). These works prove the rationality and effectiveness of applying DRL to NLP tasks, which support our work on LJP.…”

Section: Related Work, Deep Reinforcement Learning (mentioning)

confidence: 99%

Iteratively Questioning and Answering for Interpretable Legal Judgment Prediction

Zhong, Yu-zhong, et al. 2020

AAAI


Legal Judgment Prediction (LJP) aims to predict judgment results according to the facts of cases. In recent years, LJP has drawn increasing attention rapidly from both academia and the legal industry, as it can provide references for legal practitioners and is expected to promote judicial justice. However, the research to date usually suffers from the lack of interpretability, which may lead to ethical issues like inconsistent judgments or gender bias. In this paper, we present QAjudge, a model based on reinforcement learning to visualize the prediction process and give interpretable judgments. QAjudge follows two essential principles in legal systems across the world: Presumption of Innocence and Elemental Trial. During inference, a Question Net will select questions from the given set and an Answer Net will answer the question according to the fact description. Finally, a Predict Net will produce judgment results based on the answers. Reward functions are designed to minimize the number of questions asked. We conduct extensive experiments on several real-world datasets. Experimental results show that QAjudge can provide interpretable judgments while maintaining comparable performance with other state-of-the-art LJP models. The codes can be found from https://github.com/thunlp/QAjudge.


“…Language-based interaction has been studied in the context of visual question answering (de Vries et al, 2017; Chattopadhyay et al, 2017; Lee et al, 2019; Shukla et al, 2019), SQL generation (Gur et al, 2018; Yao et al, 2019), information retrieval (Chung et al, 2018; Aliannejadi et al, 2019) and multi-turn text-based question answering (Rao and Daumé III, 2018; Reddy et al, 2019; Choi et al, 2018). Most methods require learning from recorded dialogues (Hu et al, 2018; Rao and Daumé III, 2018) or conducting Wizard-of-Oz dialog annotations (Kelley, 1984; Wen et al, 2017). Instead, we limit the interaction to multiple-choice and binary questions.…”

Section: Related Work (mentioning)

confidence: 99%

“…This simplification allows us to reduce the complexity of data annotation while still achieving effective interaction. Our task can be viewed as an instance of the popular 20-question game (20Q), which has been applied to a celebrities knowledge base (Hu et al, 2018). Our approach differs in using natural language descriptions of classification targets, questions and answers to compute our distributions, instead of treating them as categorical or structural data.…”

Section: Related Work (mentioning)

confidence: 99%

Interactive Classification by Asking Informative Questions

Yu, Chen, Wang, et al. 2020

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics


We study the potential for interaction in natural language classification. We add a limited form of interaction for intent classification, where users provide an initial query using natural language, and the system asks for additional information using binary or multichoice questions. At each turn, our system decides between asking the most informative question or making the final classification prediction. The simplicity of the model allows for bootstrapping of the system without interaction data, instead relying on simple crowdsourcing tasks. We evaluate our approach on two domains, showing the benefit of interaction and the advantage of learning to balance between asking additional questions and making the final prediction.


“…For different purposes, there are various question generation tasks. Hu et al (2018) aim to ask questions to play the 20 question game. Dhingra et al (2017) teach models to ask questions to limit the number of answer candidates in task-oriented dialogues.…”

Section: Question Generation (mentioning)

confidence: 99%

Asking Clarification Questions in Knowledge-Based Question Answering

Xu, Wang, Tang, et al. 2019

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conferen


The ability to ask clarification questions is essential for knowledge-based question answering (KBQA) systems, especially for handling ambiguous phenomena. Despite its importance, clarification has not been well explored in current KBQA systems. Further progress requires supervised resources for training and evaluation, and powerful models for clarification-related text understanding and generation. In this paper, we construct a new clarification dataset, CLAQUA, with nearly 40K open-domain examples. The dataset supports three serial tasks: given a question, identify whether clarification is needed; if yes, generate a clarification question; then predict answers based on external user feedback. We provide representative baselines for these tasks and further introduce a coarse-to-fine model for clarification question generation. Experiments show that the proposed model achieves better performance than strong baselines. Further analysis demonstrates that our dataset brings new challenges and that several problems remain unsolved, such as reasonable automatic evaluation metrics for clarification question generation and powerful models for handling entity sparsity.

Footnotes: The work was done while Jingjing Xu and Yuechen Wang were interns in Microsoft Research, Asia. The dataset and code will be released at https://github.com/msra-nlc/MSParS_V2.0

[Figure: example clarification dialogue. The user asks which languages were used to create the source code of Midori; the system asks whether this refers to the web browser Midori or the operating system Midori; the user answers "I mean the first one."]

