The (Un)reliability of saliency methods
2017 · Preprint
DOI: 10.48550/arxiv.1711.00867

Cited by 69 publications (64 citation statements); references 0 publications.
“…Robustness to input perturbations: Within this category the majority of works focused on analysing the robustness of gradient-based saliency maps that are specific to analysing neural network models (differentiable models). For example, Kindermans et al [18] demonstrated that perturbing inputs by simply adding a constant shift causes several gradient-based saliency methods to attribute incorrectly. Others designed novel objective functions to demonstrate that most of the popular saliency methods can be forced to generate arbitrary explanations and attributed this to certain geometrical properties of neural networks (e.g., shape of decision boundary) [10,12].…”
Section: Feature Importance Methods (mentioning)
confidence: 99%
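A minimal numpy sketch of the constant-shift failure described in the statement above (the model, weights, and shift value are all illustrative, not taken from the cited papers): two linear models that make identical predictions on correspondingly shifted inputs have identical gradients, yet their gradient × input attributions differ.

```python
import numpy as np

# Toy linear "network": f(x) = w @ x + b. Gradients are written out
# analytically, so no autodiff framework is needed.
rng = np.random.default_rng(0)
w = rng.normal(size=5)          # model weights (illustrative)
b = 0.3                         # model bias
x = rng.normal(size=5)          # an input to explain
shift = 2.0 * np.ones(5)        # constant shift added to every input

def f(x_):
    return w @ x_ + b

# Second model whose bias compensates the shift: f2(x + shift) == f(x).
def f2(x_):
    return w @ (x_ - shift) + b

x_shifted = x + shift
assert np.isclose(f(x), f2(x_shifted))          # identical predictions

# The plain gradient is the same for both models ...
grad_f, grad_f2 = w, w
print("gradients equal:", np.allclose(grad_f, grad_f2))        # True

# ... but gradient * input attributions differ, even though the two
# models behave identically on corresponding inputs.
attr_f = grad_f * x
attr_f2 = grad_f2 * x_shifted
print("gradient*input equal:", np.allclose(attr_f, attr_f2))   # False
```

This mirrors the input-invariance check discussed in the cited work, reduced to analytic gradients on a linear model so the effect is visible in a few lines.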
“…• Robustness to input perturbations: This scenario involves keeping the machine learning model unchanged and analysing the behaviour of explainability methods under slight perturbations of model inputs [2,10,12,18,30]. Such input perturbations could be introduced deliberately by an adversary or could result from changes in data distribution.…”
Section: Taxonomy Of Robustness Analysis (mentioning)
confidence: 99%
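A self-contained sketch of the kind of check this category describes (the toy ReLU network and its sizes are illustrative, not taken from the cited works): keep the model fixed, perturb the input slightly, and compare the resulting saliency maps, here via cosine similarity.

```python
import numpy as np

rng = np.random.default_rng(1)

# Tiny ReLU network y = w2 @ relu(W1 @ x); the gradient is written out
# by hand so the sketch stays dependency-free.
W1 = rng.normal(size=(16, 8))
w2 = rng.normal(size=16)

def saliency(x):
    h = W1 @ x
    mask = (h > 0).astype(float)      # ReLU derivative
    return (w2 * mask) @ W1           # dy/dx, a gradient saliency map

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

x = rng.normal(size=8)
eps = 1e-2 * rng.normal(size=8)       # slight input perturbation

s0 = saliency(x)
s1 = saliency(x + eps)
print("saliency cosine similarity under perturbation:", cosine(s0, s1))
```

A robust explanation method would keep this similarity close to 1 for small perturbations that leave the prediction essentially unchanged.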
“…Other papers raise criticisms of some of the methods we just described. In [14], it is argued that saliency methods lack reliability when the explanation is sensitive to factors that do not contribute to the model prediction, and in [15] it is shown that DeConvNets and Guided Backpropagation do not produce the theoretically correct explanations even for a linear model, let alone for a multi-layer network with millions of parameters. Finally, in [9] and [18], the authors propose that neurons do not encode single concepts and that they are in fact multifaceted, with some concepts being encoded by a group of neurons rather than by a sole neuron by itself.…”
Section: Related Work On Interpretability and Explainability Of Neura… (mentioning)
confidence: 99%
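The linear-model criticism attributed to [15] can be illustrated with a simplified construction (not the one used in that paper; all directions and sample sizes below are made up for the sketch): for a single linear layer, gradient-style explanations reduce to the weight vector, which must cancel correlated distractors in the data and therefore need not point along the informative signal direction, whereas a covariance-based pattern does.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 10_000

# Data model: x = y * signal + noise * distractor. The distractor is
# correlated structure in x that carries no information about y.
signal = np.array([1.0, 0.0])
distractor = np.array([1.0, 1.0])
y = rng.normal(size=n)
eps = rng.normal(size=n)
X = np.outer(y, signal) + np.outer(eps, distractor)

# Least-squares linear model y_hat = X @ w. To predict y it must cancel
# the distractor, so w is rotated away from the signal direction.
w, *_ = np.linalg.lstsq(X, y, rcond=None)

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# For a single linear layer, a gradient-style saliency map is just w,
# which does not point along the informative direction ...
print("cos(weights, signal) =", cosine(w, signal))        # about 0.71
# ... while the covariance-based "pattern" cov(x, y) recovers the signal.
pattern = (X * y[:, None]).mean(axis=0)
print("cos(pattern, signal) =", cosine(pattern, signal))  # about 1.0
```

The point of the sketch is only that visualizing the filter (the weights) is not the same as visualizing the signal the model responds to, which is the gap the quoted criticism is about.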
“…As discussed earlier, gradient-based reconstruction methods might not be ideal for explaining a CNN's reasoning process [17]. Here however, we only use it to focus the reconstruction on salient regions of the agent and do not use it to explain the agent's behavior for which these methods are ideally suited.…”
Section: State Model (mentioning)
confidence: 99%