Proceedings of the 30th ACM International Conference on Information & Knowledge Management, 2021
DOI: 10.1145/3459637.3482126
Grad-SAM

Cited by 13 publications (5 citation statements) | References 28 publications
“…Some more recent works have also proposed versions of post-hoc algorithms tailored for the transformer model. 27,28…”
Section: Transformer Approach
confidence: 99%
“…However, this summary still only includes the attention layers and neglects all other network components [47]. In response, various improvements over attention rollout have been proposed, such as GradSAM [48] or an LRP-based explanation method [49], that were designed to more accurately reflect the computations of all model components.…”
Section: Explaining Attention-based Models
confidence: 99%
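For context, the attention-rollout baseline that this statement contrasts with Grad-SAM aggregates raw attention maps across layers by recursive matrix multiplication. The following is a minimal sketch, not taken from the cited papers, assuming per-layer attention arrays of shape (heads, seq, seq):

import numpy as np

def attention_rollout(attentions):
    """Attention rollout: recursively multiply per-layer attention maps,
    adding the identity to approximate residual connections.
    `attentions` is a list of (heads, seq, seq) arrays, one per layer."""
    seq_len = attentions[0].shape[-1]
    rollout = np.eye(seq_len)
    for layer_attn in attentions:
        attn = layer_attn.mean(axis=0)                   # average over heads
        attn = attn + np.eye(seq_len)                    # residual connection
        attn = attn / attn.sum(axis=-1, keepdims=True)   # renormalize rows
        rollout = attn @ rollout                         # compose with earlier layers
    return rollout  # (seq, seq) token-to-token relevance estimate

As the statement notes, this summary uses only the attention layers, which is the gap that gradient-weighted and LRP-based methods aim to close.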
“…For instance, Chefer et al (2021) utilize the Taylor Decomposition principle to assign and propagate a local relevance score through the layers of a ViT model. Similarly, Sun et al (2021) and Barkan et al (2021) employ attention gradient weighting on ViT and BERT models, respectively. However, these approaches primarily focused on the attention weight of the "cls" token, and the latter two methods weighed each token's attention weight through element-wise multiplication.…”
Section: Layer Attention Map Generation
confidence: 99%
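The gradient-weighted attention idea described in this statement (element-wise multiplication of attention maps by their gradients) can be illustrated with a short sketch. This is not the authors' reference implementation; it assumes a HuggingFace-style classifier that exposes per-layer attention maps via `output_attentions=True` and keeps them in the autograd graph:

import torch

def gradient_weighted_attention(model, inputs, target_class):
    """Sketch of gradient-weighted attention in the spirit of Grad-SAM:
    multiply each attention map element-wise by the ReLU of its gradient
    w.r.t. the target logit, then average over heads, layers, and query
    positions to obtain one relevance score per token."""
    outputs = model(**inputs, output_attentions=True)
    score = outputs.logits[0, target_class]
    attentions = outputs.attentions                 # tuple of (batch, heads, seq, seq)
    grads = torch.autograd.grad(score, attentions)
    relevance = torch.zeros_like(attentions[0][0, 0])
    for attn, grad in zip(attentions, grads):
        # Weight attention by the positive part of its gradient, average over heads.
        relevance = relevance + (attn[0] * torch.relu(grad[0])).mean(dim=0)
    relevance = relevance / len(attentions)         # average over layers
    return relevance.mean(dim=0)                    # (seq,) per-token relevance

Averaging over the query dimension, rather than reading off only the "cls" row, is one of the design choices the quoted passage highlights when comparing these methods.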