Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
DOI: 10.18653/v1/d19-1002

Attention is not not Explanation

Abstract: Attention mechanisms play a central role in NLP systems, especially within recurrent neural network (RNN) models. Recently, there has been increasing interest in whether or not the intermediate representations offered by these modules may be used to explain the reasoning for a model's prediction, and consequently reach insights regarding the model's decision-making process. A recent paper claims that 'Attention is not Explanation' (Jain and Wallace, 2019). We challenge many of the assumptions underlying this w…

Citations: cited by 608 publications (541 citation statements)
References: 15 publications (12 reference statements)
“…The present analysis identifies associations between attention and various properties of proteins. It does not attempt to establish a causal link between attention and model behavior [28,84], nor to explain model predictions [35,87]. While the focus of this paper is reconciling attention patterns with known properties of proteins, one could also leverage attention to discover novel types of properties and processes.…”
Section: Discussion (mentioning, confidence: 99%)
“…Interpreting attention on natural language sequences is a well-established area of research [12,30,87,90]. In some cases, it has been shown that attention correlates with syntactic and semantic relationships in natural language [15,32,83].…”
Section: Interpreting Models in NLP (mentioning, confidence: 99%)
“…It then becomes hard if not impossible to pinpoint the reasons behind the wrong output of a neural architecture. Interestingly, attention could provide a key to partially interpret and explain neural network behavior [5]-[9], even if it cannot be considered a reliable means of explanation [10], [11]. For instance, the weights computed by attention could point us to relevant information discarded by the neural network or to irrelevant elements of the input source that have been factored in and could explain a surprising output of the neural network.…”
Section: Introduction (mentioning, confidence: 99%)
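To make the kind of post-hoc reading described in the excerpt above concrete, here is a minimal sketch of dot-product attention over a toy sentence. The tokens, dimensions, and random vectors are illustrative assumptions, not taken from any of the cited models; ranking tokens by attention weight is the sort of partial interpretation the excerpt refers to, which, as the surrounding debate notes, is a candidate explanation rather than a guaranteed one.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Toy dot-product attention over a short sentence.
# Hidden states and the query are random stand-ins for RNN states.
rng = np.random.default_rng(0)
tokens = ["the", "movie", "was", "surprisingly", "good"]
hidden = rng.normal(size=(len(tokens), 8))   # (seq_len, hidden_dim)
query = rng.normal(size=8)                   # e.g. a final classifier state

scores = hidden @ query                      # unnormalised alignment scores
weights = softmax(scores)                    # attention distribution over tokens
context = weights @ hidden                   # weighted summary fed onward

# Reading high-weight tokens as "what the model attended to":
for tok, w in sorted(zip(tokens, weights), key=lambda p: -p[1]):
    print(f"{tok:>12s}  {w:.3f}")
```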
“…Recently, there has been a debate on whether attention can be used to explain model decisions (Serrano and Smith, 2019; Jain and Wallace, 2019; Wiegreffe and Pinter, 2019); we thus present additional analysis of our proposed method based on saliency maps (Ding et al., 2019). Saliency maps have been shown to better capture word alignment than attention probabilities in neural machine translation.…”
Section: Discussion (mentioning, confidence: 99%)
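For contrast with attention weights, a gradient-based saliency map scores tokens by the model's sensitivity to their input representations. The sketch below is not the method of Ding et al. (2019); it is a minimal illustration assuming a deliberately simple linear bag-of-embeddings scorer (all names and values are hypothetical), chosen so that the gradient-times-input attribution has a closed form.

```python
import numpy as np

rng = np.random.default_rng(1)
tokens = ["the", "plot", "was", "dull"]
emb = rng.normal(size=(len(tokens), 8))   # toy token embeddings
w = rng.normal(size=8)                    # weights of a linear sentiment scorer

# Model: score = sum_i (w . emb_i). The gradient of the score w.r.t. emb_i
# is simply w, so gradient-times-input per token reduces to w . emb_i.
per_token = emb @ w                       # each token's contribution
score = per_token.sum()                   # model output for the sentence
saliency = np.abs(per_token)              # gradient-times-input attribution

# Tokens ranked by saliency rather than by attention weight:
for tok, s in sorted(zip(tokens, saliency), key=lambda p: -p[1]):
    print(f"{tok:>8s}  {s:.3f}")
```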