Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021
DOI: 10.18653/v1/2021.findings-acl.361

Effective Attention Sheds Light On Interpretability

Abstract: An attention matrix of a transformer self-attention sublayer can provably be decomposed into two components, and only one of them (effective attention) contributes to the model output. This leads us to ask whether visualizing effective attention gives different conclusions than interpretation of standard attention. Using a subset of the GLUE tasks and BERT, we carry out an analysis to compare the two attention matrices, and show that their interpretations differ. Effective attention is less associated with the f…
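
The decomposition the abstract refers to (introduced by Brunner et al., 2020) splits each attention row into a component lying in the left null space of the value matrix V, which is annihilated in the product A·V, and a remainder, the effective attention, which is all that can reach the output. A minimal NumPy sketch of that projection follows, assuming A is the (seq_len × seq_len) attention matrix of one head and V its (seq_len × d_v) value matrix; the function name and rank tolerance are illustrative choices, not from the paper.

```python
import numpy as np

def effective_attention(A, V, tol=1e-8):
    """Split attention A into the part that can reach the output.

    Any attention row component u with u @ V == 0 (i.e., u in the left
    null space of V) is cancelled in the product A @ V and so cannot
    influence the sublayer output. Projecting that component away
    yields the effective attention, satisfying A_eff @ V == A @ V.
    """
    # The left null space of V is spanned by the left singular vectors
    # whose singular values are (numerically) zero.
    U, s, _ = np.linalg.svd(V)   # U: (seq_len, seq_len)
    rank = int(np.sum(s > tol))
    B = U[:, rank:]              # orthonormal basis of the left null space
    A_null = A @ B @ B.T         # row-wise projection onto that subspace
    return A - A_null
```

For typical BERT inputs the sequence length exceeds the per-head value dimension (d_v = 64), so the left null space is non-trivial and standard and effective attention can differ substantially.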

Cited by 7 publications (7 citation statements)
References 18 publications
“…Attention matrix. The attention matrix as an explanation of individual predictions has been extensively studied in [34,18,37,32]. Although these works have shown through a set of experiments that the correlation between learned attention weights and feature importance is weak, we visualize the attention weights of each head in the Transformer to compare the results with LRP.…”
Section: Methods LRP
confidence: 99%
“…Our benchmarking study provides a perfect test-bed to understand if attention aligns with attribution methods. We compare standard self-attention with effective attention (Brunner et al., 2020; Sun and Marasović, 2021). Further, we measure attribution between input tokens and hidden representations using Hidden Token Attribution (HTA) (Brunner et al., 2020).…”
Section: See Appendix A2 For All Implementation Details
confidence: 99%
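
Hidden Token Attribution, as defined by Brunner et al. (2020), scores the contribution of input token j to hidden representation i by the norm of the Jacobian block ∂h_i/∂x_j. A minimal PyTorch sketch, assuming a differentiable map f from input embeddings to hidden states; the function name and the final normalization are illustrative, not the paper's exact implementation.

```python
import torch

def hidden_token_attribution(f, x, i):
    """Attribution of each input token embedding x[j] to hidden state h[i],
    measured as the Frobenius norm of the Jacobian block d h[i] / d x[j],
    normalized over tokens (Hidden Token Attribution, sketched)."""
    x = x.clone().requires_grad_(True)
    h = f(x)                                  # (seq_len, d_hidden)
    rows = []
    for k in range(h.shape[-1]):              # build the Jacobian of h[i]
        g, = torch.autograd.grad(h[i, k], x, retain_graph=True)
        rows.append(g)                        # each g: (seq_len, d_in)
    J = torch.stack(rows)                     # (d_hidden, seq_len, d_in)
    scores = J.pow(2).sum(dim=(0, 2)).sqrt()  # one norm per input token
    return scores / scores.sum()
```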
“…However, in the Transformer architecture (Vaswani et al., 2017), it has become a means to account for lexical influence and long-range dependencies. It also provides useful information about the importance of a term for the output (Wiegreffe and Pinter, 2019; Brunner et al., 2020; Sun and Marasović, 2021). Here, we use the notion of attention entropy, and EAR's use of it in BERT.…”
Section: Entropy-based Attention Regularization
confidence: 99%
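
The attention entropy that EAR builds on is simply the Shannon entropy of each attention distribution, i.e., of each softmax-normalized row of the attention matrix: low entropy means a head concentrates on a few tokens, high entropy means it attends broadly. A minimal NumPy sketch, with an illustrative function name and clipping constant:

```python
import numpy as np

def attention_entropy(A, eps=1e-12):
    """Shannon entropy of each attention distribution (row of A).

    Rows of A are softmax outputs, so each sums to 1. Low entropy means
    the head concentrates its attention mass on a few tokens; high
    entropy means it spreads attention broadly across the sequence.
    """
    A = np.clip(A, eps, 1.0)                 # guard against log(0)
    return -np.sum(A * np.log(A), axis=-1)   # one value per query position
```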