State-of-the-art models in NLP are now predominantly based on deep neural networks that are opaque in terms of how they come to make predictions. This limitation has increased interest in designing more interpretable deep models for NLP that reveal the 'reasoning' behind model outputs. But work in this direction has been conducted on different datasets and tasks with correspondingly unique aims and metrics; this makes it difficult to track progress. We propose the Evaluating Rationales And Simple English Reasoning (ERASER) benchmark to advance research on interpretable models in NLP. This benchmark comprises multiple datasets and tasks for which human annotations of "rationales" (supporting evidence) have been collected. We propose several metrics that aim to capture how well the rationales provided by models align with human rationales, and also how faithful these rationales are (i.e., the degree to which the provided rationales influenced the corresponding predictions). Our hope is that releasing this benchmark facilitates progress on designing more interpretable NLP systems. The benchmark, code, and documentation are available at https://www.eraserbenchmark.com/

[Figure: example instances, candidate labels, and human rationales from four ERASER datasets.]
Commonsense Explanations (CoS-E). Question: "Where do you find the most amount of leafs?" Choices: (a) Compost pile (b) Flowers (c) Forest (d) Field (e) Ground.
Movie Reviews. Review: "In this movie, … Plots to take over the world. The acting is great! The soundtrack is run-of-the-mill, but the action more than makes up for it." Labels: (a) Positive (b) Negative.
Evidence Inference. Article: "Patients for this trial were recruited … Compared with 0.9% saline, 120 mg of inhaled nebulized furosemide had no effect on breathlessness during exercise." Prompt: "With respect to breathlessness, what is the reported difference between patients receiving placebo and those receiving furosemide?" Labels: (a) Sig. decreased (b) No sig. difference (c) Sig. increased.
e-SNLI. H: "A man in an orange vest leans over a pickup truck." P: "A man is touching a truck." Labels: (a) Entailment (b) Contradiction (c) Neutral.
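The two families of metrics described above, agreement with human rationales and faithfulness, can be sketched as follows. This is an illustrative simplification, not the benchmark's reference implementation: the function names are hypothetical, and the faithfulness scores assume the caller has already re-run the model on the perturbed inputs.

```python
# Sketch of ERASER-style metrics (illustrative, simplified):
# token-level F1 between predicted and human rationale tokens, plus
# comprehensiveness/sufficiency scores for faithfulness.

def token_f1(pred_tokens, gold_tokens):
    """Token-level F1 between predicted and human rationale token indices."""
    pred, gold = set(pred_tokens), set(gold_tokens)
    if not pred or not gold:
        return float(pred == gold)
    tp = len(pred & gold)
    precision = tp / len(pred)
    recall = tp / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def comprehensiveness(p_full, p_without_rationale):
    """Drop in the predicted class probability after erasing the rationale
    from the input. A large drop suggests the rationale genuinely
    influenced the prediction."""
    return p_full - p_without_rationale

def sufficiency(p_full, p_rationale_only):
    """Change in probability when the model sees only the rationale.
    A value near zero suggests the rationale alone supports the prediction."""
    return p_full - p_rationale_only
```

For example, a predicted rationale covering tokens {1, 2, 3} against a human rationale {2, 3, 4} yields precision and recall of 2/3 each, so an F1 of 2/3.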
Transformer architectures have been shown to learn useful representations for protein classification and generation tasks. However, these representations are difficult to interpret. Through the lens of attention, we analyze the inner workings of the Transformer and explore how the model discerns structural and functional properties of proteins. We show that attention (1) captures the folding structure of proteins, connecting amino acids that are far apart in the underlying sequence but spatially close in the three-dimensional structure, (2) targets binding sites, a key functional component of proteins, and (3) focuses on progressively more complex biophysical properties with increasing layer depth. We also present a three-dimensional visualization of the interaction between attention and protein structure. Our findings align with known biological processes and provide a tool to aid discovery in protein engineering and synthetic biology. The code for visualization and analysis is available at https://github.com/salesforce/provis.
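One way to quantify finding (1), that attention captures folding structure, is to check whether the most strongly attended long-range residue pairs are spatial contacts. The sketch below assumes a precomputed attention matrix for one head and a pairwise C-alpha distance matrix; the function name and thresholds are illustrative, not the paper's exact analysis code.

```python
import numpy as np

# Sketch: fraction of a head's top-attended long-range residue pairs
# that are contacts (spatially close in 3D despite sequence distance).

def attention_contact_precision(attn, dist, seq_sep=6, dist_thresh=8.0, top_k=10):
    """attn: (L, L) attention weights for one head.
    dist: (L, L) pairwise C-alpha distances in angstroms.
    Returns the fraction of the top-k attended pairs separated by at
    least seq_sep positions in sequence whose distance is below
    dist_thresh (i.e. contacts)."""
    L = attn.shape[0]
    i, j = np.triu_indices(L, k=seq_sep)        # long-range pairs only
    sym = attn + attn.T                          # symmetrize attention
    order = np.argsort(sym[i, j])[::-1][:top_k]  # top-k attended pairs
    return float(np.mean(dist[i[order], j[order]] < dist_thresh))
```

A value near 1.0 for a given head would indicate that its strongest long-range attention closely tracks the protein's contact map.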
Deep learning models perform poorly on tasks that require commonsense reasoning, which often necessitates some form of world knowledge or reasoning over information not immediately present in the input. We collect human explanations for commonsense reasoning in the form of natural language sequences and highlighted annotations in a new dataset called Common Sense Explanations (CoS-E). We use CoS-E to train language models to automatically generate explanations that can be used during training and inference in a novel Commonsense Auto-Generated Explanation (CAGE) framework. CAGE improves the state of the art by 10% on the challenging CommonsenseQA task. We further study commonsense reasoning in DNNs using both human and auto-generated explanations, including transfer to out-of-domain tasks. Empirical results indicate that we can effectively leverage language models for commonsense reasoning.
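At the input level, the two CAGE stages (generate an explanation with a language model, then condition an answer classifier on it) reduce to prompt construction. The templates below approximate that conditioning format for illustration; the exact strings, separator token, and function names are assumptions, not the paper's verbatim implementation.

```python
# Illustrative sketch of the two CAGE stages as input formatting.

def explanation_prompt(question, choices):
    """Stage 1: conditioning text fed to a language model, which is
    trained to continue it with a natural-language explanation."""
    choice_str = ", ".join(choices)
    return f"{question} The choices are {choice_str}. My commonsense tells me that"

def classifier_input(question, choices, explanation):
    """Stage 2: the generated explanation is appended to the original
    question and choices so the answer classifier can condition on it."""
    return f"{question} [SEP] {' [SEP] '.join(choices)} [SEP] {explanation}"
```

Because the explanation is generated before the answer is predicted, the same pipeline works identically at training and inference time, which is what lets auto-generated explanations substitute for human ones.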
While large-scale language models (LMs) are able to imitate the distribution of natural language well enough to generate realistic text, it is difficult to control which regions of the distribution they generate. This is especially problematic because datasets used for training large LMs usually contain significant toxicity, hate, bias, and negativity. One promising approach to address this is to use discriminators to guide decoding from LMs, but existing methods for this are too slow to be useful in practice for many applications. We present GeDi, a significantly more efficient discriminator-based approach for guiding decoding. GeDi guides generation at each step by computing classification probabilities for all possible next tokens via Bayes rule, normalizing over two class-conditional distributions: one conditioned on the desired attribute, or control code, and another conditioned on the undesired attribute, or anti-control code. We find that GeDi gives controllability on par with or better than previous controllable generation methods, and in our experiments it generates significantly faster than the only previous method that achieved comparable controllability. We also show that GeDi can make GPT-2 and GPT-3 significantly less toxic while maintaining linguistic fluency, without significantly sacrificing generation speed. Lastly, we find that training GeDi on only three topics allows us to controllably generate new topics zero-shot from just a keyword.
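The per-step Bayes-rule computation can be sketched over a toy vocabulary as follows. This is a simplified single-step sketch, not GeDi's implementation: the three log-probability vectors stand in for real LM outputs, and the running sequence-level log-probabilities that GeDi accumulates under each control code are omitted.

```python
import numpy as np

# Sketch of GeDi's per-step Bayes-rule reweighting over a toy vocabulary.

def gedi_step(base_logp, cond_logp, anti_logp, prior=0.5, omega=1.0):
    """base_logp: base LM log-probs over candidate next tokens.
    cond_logp / anti_logp: class-conditional LM log-probs under the
    desired control code and the anti-control code, respectively.
    Returns next-token probabilities reweighted toward the desired
    attribute."""
    # Bayes rule over the two class-conditional distributions:
    # P(desired | token) = prior * P(token | c)
    #   / (prior * P(token | c) + (1 - prior) * P(token | c_bar))
    num = prior * np.exp(cond_logp)
    den = num + (1.0 - prior) * np.exp(anti_logp)
    p_class = num / den
    # Weight base LM probabilities by the class posterior (raised to
    # omega to sharpen or soften the guidance), then renormalize.
    w = np.exp(base_logp) * p_class ** omega
    return w / w.sum()
```

Because the posterior is computed for every candidate token in a single forward pass of the class-conditional LM, this is what makes the approach much faster than discriminators that must score each candidate continuation separately.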