Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP 2016)
DOI: 10.18653/v1/d16-1011

Rationalizing Neural Predictions

Abstract: Prediction without justification has limited applicability. As a remedy, we learn to extract pieces of input text as justifications (rationales) that are tailored to be short and coherent, yet sufficient for making the same prediction. Our approach combines two modular components, generator and encoder, which are trained to operate well together. The generator specifies a distribution over text fragments as candidate rationales and these are passed through the encoder for prediction. Rationales are never given…
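The abstract describes a two-part architecture: a generator defines a distribution over text fragments (rationales), and an encoder must make the prediction from those fragments alone, with training encouraging selections that are short and coherent. Below is a minimal, hypothetical PyTorch sketch of that setup; the class names, dimensions, word-level Bernoulli selection, and the specific sparsity/coherence penalties are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Scores each token; defines an independent-Bernoulli distribution
    over which tokens are kept as the rationale (an assumption for this sketch)."""
    def __init__(self, vocab_size, emb_dim=100, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.score = nn.Linear(2 * hidden, 1)

    def forward(self, tokens):                      # tokens: (batch, seq)
        h, _ = self.rnn(self.emb(tokens))
        return torch.sigmoid(self.score(h)).squeeze(-1)   # keep-probabilities

class Encoder(nn.Module):
    """Predicts the target from the masked (rationale-only) input."""
    def __init__(self, vocab_size, emb_dim=100, hidden=128, n_out=1):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_out)

    def forward(self, tokens, mask):                # mask: (batch, seq) in {0, 1}
        x = self.emb(tokens) * mask.unsqueeze(-1)   # zero out non-rationale words
        _, h = self.rnn(x)
        return self.out(h.squeeze(0))

def training_step(gen, enc, tokens, target, sparsity=1e-3, coherence=2e-3):
    """One REINFORCE-style step: sample a rationale, score the encoder's
    prediction on it, and penalize long or fragmented selections."""
    probs = gen(tokens)
    z = torch.bernoulli(probs)                      # sampled rationale mask
    pred = enc(tokens, z)
    loss = nn.functional.mse_loss(pred.squeeze(-1), target, reduction="none")
    # Regularizers matching the abstract: short (few words) and coherent (contiguous).
    cost = loss + sparsity * z.sum(1) + coherence * (z[:, 1:] - z[:, :-1]).abs().sum(1)
    logp = torch.distributions.Bernoulli(probs).log_prob(z).sum(1)
    # Generator receives the policy-gradient signal; encoder the plain prediction loss.
    return (cost.detach() * logp).mean() + loss.mean()
```

At inference time one would typically threshold the generator's keep-probabilities (e.g. at 0.5) and read off the surviving words as the rationale for the encoder's prediction.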

Cited by 566 publications (772 citation statements). References 26 publications.
“…The key difference between our work and Lei et al. (2016)'s method is that our method optimizes for faster inference and is more dynamic in its jumping. The same distinction holds between our approach and the "soft" attention approach of Bahdanau et al. (2014).…”
Section: Related Work
confidence: 99%
“…The ideal complex neural conversational model should yield improved performance and offer interpretable rationales for its answer predictions. The current cutting-edge approach, presented in [23], incorporates rationale generation as an integral part of the learning problem. This approach limits the model to extractive rationales by requiring the rationales to be subsets of words from the input text that are short and coherent and that alone suffice for prediction as a substitute for the original text.…”
Section: A Sentiment Analysis and Reasoning Network in Neural Conversation
confidence: 99%
“…In many ways, deep learning has become the canonical example of the "black box" of machine learning, and many of the approaches to explaining it can be loosely categorized into two types: approaches that try to interpret the parameters themselves (e.g., with visualizations and heat maps (Zeiler and Fergus, 2014; Hermann et al., 2015; Li et al., 2016)), and approaches that generate human-interpretable information that is ideally correlated with what is being learned inside the model (e.g., Lei et al. (2016)). Deep learning has been successfully applied to many recent QA approaches and related tasks (Bordes et al., 2015; Hermann et al., 2015; He and Golub, 2016; Dong et al., 2015; Tan et al., 2016, inter alia). However, large quantities of data are needed to train the millions of parameters often contained in these models.…”
Section: Related Work
confidence: 99%
“…One approach to interpreting complex models is to make use of human-interpretable information generated by the model to gain insight into what the model is learning. We follow the intuition of Lei et al. (2016), whose two-component network first generates text spans from an input document and then uses these text spans to make predictions. Lei et al. utilize these intermediate text spans to infer the model's preferences.…”
Section: Introduction
confidence: 99%