2022 · Preprint
DOI: 10.48550/arxiv.2202.07304
XAI for Transformers: Better Explanations through Conservative Propagation

Abstract: Transformers have become an important workhorse of machine learning, with numerous applications. This necessitates the development of reliable methods for increasing their transparency. Multiple interpretability methods, often based on gradient information, have been proposed. We show that the gradient in a Transformer reflects the function only locally, and thus fails to reliably identify the contribution of input features to the prediction. We identify Attention Heads and LayerNorm as main reasons for such u…
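The core idea behind the paper's "conservative propagation", as far as it can be reconstructed here, is to treat the softmax attention weights and the LayerNorm normalization factor as constants during the backward pass, so that gradient×input behaves like an LRP-style, conservation-respecting attribution. The following is a minimal PyTorch sketch of that idea under those assumptions; the toy single-head block and function names are illustrative, not the authors' code.

```python
import torch

def attention_block(q, k, v, detach_attention=True):
    """Single-head self-attention. With detach_attention=True the softmax
    attention matrix is treated as a constant in the backward pass, so
    relevance flows only through the value path, not through the locally
    unstable softmax."""
    a = torch.softmax(q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
    if detach_attention:
        a = a.detach()
    return a @ v

def layer_norm(x, eps=1e-5, detach_norm=True):
    """LayerNorm whose standard-deviation denominator is detached, so the
    normalization acts as a fixed rescaling during gradient propagation."""
    mean = x.mean(-1, keepdim=True)
    std = (x.var(-1, keepdim=True, unbiased=False) + eps).sqrt()
    if detach_norm:
        std = std.detach()
    return (x - mean) / std

# Gradient x input attribution through the modified layers:
x = torch.randn(1, 5, 16, requires_grad=True)   # (batch, tokens, features)
h = layer_norm(attention_block(x, x, x))        # toy "Transformer" block
score = h.sum()                                 # stand-in for a logit
score.backward()
relevance = (x * x.grad).sum(-1)                # per-token relevance scores
print(relevance)
```

With `detach_attention=False` and `detach_norm=False` the same two lines compute plain gradient×input, which is the unreliable baseline the abstract criticizes.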

Cited by 3 publications (2 citation statements) · References 30 publications (46 reference statements)
“…They are based on a per-class weighted linear sum of visual patterns present at various spatial locations in an image and produce heatmap representations that indicate which regions of the input image were most important for the CNN’s decisions. Recently, there have been initial attempts to use Grad-CAM on transformer architectures, but their effectiveness is still under debate [46, 47]. However, thanks to the attention mechanism, transformers are intrinsically able to support explanations based on the inspection of the weights in the attention matrices, like the Attention Rollout [48].…”
Section: Discussion (mentioning)
confidence: 99%
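The Attention Rollout cited as [48] combines per-layer attention maps by averaging over heads, adding an identity term for the residual connection, renormalizing, and multiplying the matrices across layers. A short NumPy sketch of that recursion, assuming the input is a list of per-layer attention tensors of shape (heads, tokens, tokens):

```python
import numpy as np

def attention_rollout(attentions):
    """Attention Rollout: average attention over heads, add the identity
    to account for residual connections, renormalize rows, and compose
    the resulting matrices across layers.

    attentions: list of arrays with shape (num_heads, seq_len, seq_len).
    Returns an array of shape (seq_len, seq_len) whose row i gives the
    rolled-out attention of token i over the input tokens.
    """
    rollout = None
    for layer_attn in attentions:
        a = layer_attn.mean(axis=0)                      # average over heads
        a = a + np.eye(a.shape[-1])                      # residual connection
        a = a / a.sum(axis=-1, keepdims=True)            # renormalize rows
        rollout = a if rollout is None else a @ rollout  # compose layers
    return rollout

# Example with random attention maps: 4 layers, 8 heads, 10 tokens.
rng = np.random.default_rng(0)
attn = [rng.dirichlet(np.ones(10), size=(8, 10)) for _ in range(4)]
print(attention_rollout(attn).shape)  # (10, 10)
```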
“…In Kokalj et al. (2021), the known feature-importance XAI approach ‘Shapley additive explanations’ (Lundberg & Lee, 2017) has been adapted to account for the contextualized (token-based) text representation in language models. Further, approaches based on the attention weights in language models have recently been proposed (Ali et al., 2022; S. Liu et al., 2021), similarly establishing feature importance scores for language model predictions.…”
Section: Ad Category B) (mentioning)
confidence: 99%
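The token-level Shapley-value explanations mentioned in this statement can be approximated out of the box with the shap library's text explainer, which masks tokens and attributes the resulting change in model output per token. The sketch below assumes a Hugging Face sentiment pipeline; the model name and example sentence are arbitrary illustrations, and this shows the generic masking-based approach rather than the citing paper's specific adaptation.

```python
# Sketch: token-level Shapley-style attributions with `shap` and a
# Hugging Face pipeline. The model checkpoint and input text are
# illustrative choices, not taken from the cited works.
import shap
import transformers

classifier = transformers.pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
    top_k=None,  # return scores for all classes
)

explainer = shap.Explainer(classifier)  # auto-selects a text masker for pipelines
shap_values = explainer(
    ["The explanations produced by this model are surprisingly faithful."]
)

# Per-token contributions to the "POSITIVE" class:
print(shap_values[:, :, "POSITIVE"].values)
```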