Due to the black-box nature of deep learning models, solutions for visually explaining CNNs have recently been developed. Given the high cost of user studies, objective metrics are necessary to compare and evaluate these methods. In this paper, we critically analyze the Deletion Area Under Curve (DAUC) and Insertion Area Under Curve (IAUC) metrics proposed by Petsiuk et al. (2018). These metrics were designed to evaluate the faithfulness of saliency maps generated by generic methods such as Grad-CAM or RISE. First, we show that the actual saliency score values given by the saliency map are ignored, as only the ranking of the scores is taken into account. Consequently, these metrics are insufficient by themselves, since the visual appearance of a saliency map can change significantly without the ranking of its scores being modified. Second, we argue that during the computation of DAUC and IAUC, the model is presented with images that lie outside the training distribution, which may lead to unreliable behavior of the model being explained. To complement DAUC/IAUC, we propose new metrics that quantify the sparsity and the calibration of explanation methods, two previously unstudied properties. Finally, we give general remarks about the metrics studied in this paper and discuss how to evaluate them in a user study.
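To make the deletion procedure concrete, the following is a minimal sketch of DAUC, assuming the model returns logits, `image` is a (3, h, w) tensor, and `saliency` is an (h, w) NumPy array; the number of steps and the zero baseline used for deletion are illustrative choices, not prescribed by the metric.

```python
import numpy as np
import torch

def dauc(model, image, saliency, target_class, steps=100):
    """Deletion AUC: remove pixels in decreasing order of saliency and
    track the target-class probability. A faithful map makes the
    probability drop quickly, i.e. yields a *low* area under the curve.
    Note that only the ranking of the saliency scores is used."""
    h, w = saliency.shape
    order = np.argsort(-saliency.reshape(-1))  # most salient pixels first
    perturbed = image.clone()
    chunk = max(1, (h * w) // steps)

    def class_prob(x):
        with torch.no_grad():
            return torch.softmax(model(x.unsqueeze(0)), dim=1)[0, target_class].item()

    scores = [class_prob(perturbed)]  # score on the unperturbed image
    for start in range(0, h * w, chunk):
        rows, cols = np.unravel_index(order[start:start + chunk], (h, w))
        perturbed[:, rows, cols] = 0.0  # delete by zeroing (a common baseline choice)
        scores.append(class_prob(perturbed))
    # normalize the x-axis to [0, 1] so the AUC is comparable across image sizes
    return float(np.trapz(scores, dx=1.0 / (len(scores) - 1)))
```

IAUC is the symmetric insertion variant: starting from a heavily blurred image, the most salient pixels are progressively reinserted, and a faithful map yields a high area under the curve. Both variants depend on the saliency map only through `order`, which is precisely the ranking issue discussed above.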
An important limitation to the development of AI-based solutions for In Vitro Fertilization (IVF) is the black-box nature of most state-of-the-art models, a consequence of the complexity of deep learning architectures, which raises potential bias and fairness issues. The need for interpretable AI has risen not only in the IVF field but also in the deep learning community in general. This has started a trend in the literature where authors focus on designing objective metrics to evaluate generic explanation methods. In this paper, we study the behavior of recently proposed objective faithfulness metrics applied to the problem of embryo stage identification. We benchmark attention models and post-hoc methods using these metrics and show empirically that (1) the metrics produce low overall agreement on the model ranking and (2) depending on the metric approach, either post-hoc methods or attention models are favored. We conclude with general remarks about the difficulty of defining faithfulness and the necessity of understanding its relationship with the type of approach that is favored.
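As an illustration of point (1), agreement between model rankings can be quantified with a rank correlation such as Kendall's tau. The sketch below assumes each metric has already been reduced to one score per model (higher meaning better); the function and argument names are illustrative and not tied to any particular set of metrics.

```python
from itertools import combinations
from scipy.stats import kendalltau

def ranking_agreement(scores):
    """scores: dict mapping metric name -> list of per-model scores,
    all lists covering the same models in the same order.
    Returns Kendall's tau for every pair of metrics: +1 means the two
    metrics rank the models identically; values near 0 mean low agreement."""
    agreement = {}
    for m1, m2 in combinations(scores, 2):
        tau, _ = kendalltau(scores[m1], scores[m2])
        agreement[(m1, m2)] = tau
    return agreement
```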
Facing the black-box nature of deep learning models for image classification, a popular trend in the literature proposes methods that generate explanations in the form of heat maps indicating the areas that played an important role in the model's decision. Such explanations are called saliency maps and constitute an active field of research, as many fundamental questions remain open: How to compute them efficiently? How to evaluate them? What exactly can they be used for? Given the increasing rate at which papers are produced and the vast body of existing literature, we propose this study to help newcomers join this community and contribute to the research field. First, we discuss the two existing approaches for generating saliency maps, namely post-hoc methods and attention models. Post-hoc methods are generic algorithms that can be applied to any model from a given class without requiring fine-tuning. In contrast, attention models are ad-hoc architectures that generate a saliency map during the inference phase to guide the decision. We show that both approaches can be divided into several subcategories and illustrate each of them with one important model or method. Second, we present the current methodologies used to evaluate saliency maps, including objective and subjective protocols, depending on whether or not they involve users. Among objective methods, we notably detail faithfulness metrics and provide an implementation of the faithfulness metrics discussed in this paper (https://github.com/TristanGomez44/metrics-saliency-maps).
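As a concrete example of a post-hoc method, here is a minimal Grad-CAM sketch in PyTorch (one of the methods mentioned above), assuming a torchvision ResNet-50 and an already preprocessed input; the choice of target layer and of pretrained weights are illustrative assumptions.

```python
import torch
import torch.nn.functional as F
from torchvision.models import resnet50

def grad_cam(model, layer, image, target_class):
    """Grad-CAM: weight the feature maps of `layer` by the spatially
    averaged gradient of the target-class score, then ReLU and upsample
    the result to the input resolution."""
    feats, grads = {}, {}
    fwd = layer.register_forward_hook(lambda m, i, o: feats.update(a=o))
    bwd = layer.register_full_backward_hook(lambda m, gi, go: grads.update(a=go[0]))
    model(image.unsqueeze(0))[0, target_class].backward()
    fwd.remove(); bwd.remove()
    weights = grads["a"].mean(dim=(2, 3), keepdim=True)   # GAP of gradients over space
    cam = F.relu((weights * feats["a"]).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=image.shape[1:], mode="bilinear",
                        align_corners=False)
    return (cam / (cam.max() + 1e-8)).squeeze()           # (H, W) saliency map in [0, 1]

model = resnet50(weights="IMAGENET1K_V2").eval()
# saliency = grad_cam(model, model.layer4, preprocessed_image, target_class)
```

Because it only needs forward and backward hooks, this method applies unchanged to any CNN exposing a convolutional layer, which is what makes post-hoc methods generic in the sense described above.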
An important limitation to the development of Artificial Intelligence (AI)-based solutions for In Vitro Fertilization (IVF) is the absence of a public reference benchmark to train and evaluate deep learning (DL) models. In this work, we describe a fully annotated dataset of 756 videos of developing embryos, for a total of 337k images. We apply ResNet, LSTM, and ResNet-3D architectures to our dataset and demonstrate that they outperform algorithmic approaches for automatically annotating development phases. Altogether, we propose the first public benchmark that will allow the community to evaluate morphokinetic models. This is the first step towards deep-learning-powered IVF. Of note, we provide highly detailed annotations with 16 different development phases, including early cell division phases, but also late cell divisions, phases after morulation, and very early phases, which have never been used before. We postulate that this original approach will help improve the overall performance of deep learning approaches on time-lapse videos of embryo development, ultimately benefiting infertile patients with improved clinical success rates.
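To make the modeling approach concrete, the following is a minimal sketch combining a per-frame ResNet encoder with an LSTM over time, in the spirit of the architectures listed above; the backbone, hidden size, and per-frame output format are assumptions for illustration, not the exact benchmarked configuration.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class StageClassifier(nn.Module):
    """Per-frame ResNet features aggregated over time by an LSTM,
    predicting one of the 16 development phases for each frame."""
    def __init__(self, num_phases=16, hidden=256):
        super().__init__()
        backbone = resnet18(weights="IMAGENET1K_V1")
        self.encoder = nn.Sequential(*list(backbone.children())[:-1])  # drop the fc head
        self.lstm = nn.LSTM(input_size=512, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_phases)

    def forward(self, video):                      # video: (B, T, 3, H, W)
        b, t = video.shape[:2]
        feats = self.encoder(video.flatten(0, 1))  # (B*T, 512, 1, 1) after avgpool
        feats = feats.flatten(1).view(b, t, -1)    # (B, T, 512) frame embeddings
        out, _ = self.lstm(feats)                  # temporal context per frame
        return self.head(out)                      # (B, T, num_phases) logits
```

Training such a model with a per-frame cross-entropy loss directly yields the phase annotations that the benchmark evaluates, while the LSTM enforces temporal consistency across the video.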