2022
DOI: 10.48550/arxiv.2201.13291
Preprint

Metrics for saliency map evaluation of deep learning explanation methods

Abstract: Due to the black-box nature of deep learning models, there is a recent development of solutions for visual explanations of CNNs. Given the high cost of user studies, metrics are necessary to compare and evaluate these different methods. In this paper, we critically analyze the Deletion Area Under Curve (DAUC) and Insertion Area Under Curve (IAUC) metrics proposed by Petsiuk et al. (2018). These metrics were designed to evaluate the faithfulness of saliency maps generated by generic methods such as Grad-CAM or …

Cited by 4 publications (8 citation statements)
References 18 publications
“…The same issue of unnatural inputs was raised by Mase et al (2019). Gomez et al (2022) also point out that insertion and deletion tests only compare the rankings of the inputs. We have chosen to study insertion and deletion tests because they avoid the prohibitive cost of retraining.…”
Section: Related Work
confidence: 87%
“…The insertion and deletion tests we study have been criticized by Gomez et al (2022) who note that the synthesized images generated in these tests are unnatural and do not resemble the images on which the algorithms were trained. The same issue of unnatural inputs was raised by Mase et al (2019).…”
Section: Related Work
confidence: 99%
“…The Deletion metric [22,39,46] (↓ lower is better) measures the Area under the Curve (AUC) of the target-class probability as we zero out the top-N highest-attribution pixels at each step in the input image. That is, a faithful AM is expected to have a lower AUC in Deletion.…”
Section: Evaluation Metrics
confidence: 99%
“…That is, a faithful AM is expected to have a lower AUC in Deletion. For the Insertion metric [22,39,46] (↑ higher is better) we start from a zero image and add the top-N highest-attribution pixels at each step until recovering the original image, and calculate the AUC of the probability curve. For both Deletion and Insertion, we use the implementation by [38] and N = 448 at each step.…”
Section: Evaluation Metrics
confidence: 99%
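
The two statements above describe the standard Deletion/Insertion protocol of Petsiuk et al. (2018) that the preprint analyzes. As a rough, non-authoritative sketch of how such AUC scores are typically computed: the model, image tensor, saliency map, class index, and the step size n_per_step below are hypothetical placeholders, not the implementation used by the cited works.

```python
import torch
import torch.nn.functional as F

def deletion_insertion_auc(model, image, saliency, target, n_per_step=448, mode="deletion"):
    """Sketch of the Deletion / Insertion AUC metrics (Petsiuk et al., 2018).

    image:    (1, C, H, W) input tensor
    saliency: (H, W) attribution map for the target class
    target:   index of the class whose probability is tracked
    """
    model.eval()
    _, _, h, w = image.shape
    # Rank pixel positions from most to least salient.
    order = saliency.flatten().argsort(descending=True)

    # Deletion starts from the original image and zeroes pixels out;
    # Insertion starts from an all-zero image and copies original pixels back in.
    current = image.clone() if mode == "deletion" else torch.zeros_like(image)

    probs = []
    with torch.no_grad():
        for start in range(0, order.numel() + n_per_step, n_per_step):
            # Record the target-class probability for the current image.
            p = F.softmax(model(current), dim=1)[0, target].item()
            probs.append(p)
            idx = order[start:start + n_per_step]
            if idx.numel() == 0:
                break
            ys = torch.div(idx, w, rounding_mode="floor")
            xs = idx % w
            if mode == "deletion":
                current[0, :, ys, xs] = 0.0
            else:
                current[0, :, ys, xs] = image[0, :, ys, xs]

    # Area under the probability curve, normalised so the x-axis spans [0, 1].
    probs = torch.tensor(probs)
    return torch.trapz(probs, dx=1.0 / max(len(probs) - 1, 1)).item()
```

Under this convention a faithful attribution map should yield a low Deletion AUC and a high Insertion AUC, matching the ↓/↑ notation used in the quoted passage.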
“…However, the financial cost and the difficulty of establishing a correct protocol make this approach difficult. Because of these issues, another trend focuses on designing objective metrics to evaluate generic explanation methods [25,18,5,12]. In this paper, we follow this trend and study the behavior of recently proposed objective faithfulness metrics applied to the problem of embryo stage identification.…”
Section: Introduction
confidence: 99%