The Dangers of Post-hoc Interpretability: Unjustified Counterfactual Explanations

Post-hoc interpretability approaches have been proven to be powerful tools to generate explanations for the predictions made by a trained blackbox model. However, they create the risk of having explanations that are a result of some artifacts learned by the model instead of actual knowledge from the data. This paper focuses on the case of counterfactual explanations and asks whether the generated instances can be justified, i.e. continuously connected to some ground-truth data. We evaluate the risk of generating unjustified counterfactual examples by investigating the local neighborhoods of instances whose predictions are to be explained and show that this risk is quite high for several datasets. Furthermore, we show that most state of the art approaches do not differentiate justified from unjustified counterfactual examples, leading to less useful explanations.


Comparison-Based Inverse Classification for Interpretability in Machine Learning

Laugel, Lesot, Marsala et al. 2018


In the context of post-hoc interpretability, this paper addresses the task of explaining the prediction of a classifier, considering the case where no information is available, neither on the classifier itself, nor on the processed data (neither the training nor the test data). It proposes an instance-based approach whose principle consists in determining the minimal changes needed to alter a prediction: given a data point whose classification must be explained, the proposed method consists in identifying a close neighbour classified differently, where the closeness definition integrates a sparsity constraint. This principle is implemented using observation generation in the Growing Spheres algorithm. Experimental results on two datasets illustrate the relevance of the proposed approach that can be used to gain knowledge about the classifier.
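The principle described above (find a close, differently classified neighbour, then make the change sparse) can be illustrated with a minimal sketch. This is a simplified reading of the abstract, not the authors' implementation of Growing Spheres: the function names, the fixed `step`, `n_samples` and `max_radius` parameters, and the greedy sparsification order are all assumptions made for illustration.

```python
import math
import random


def sample_in_layer(center, r_low, r_high, rng):
    """Draw one point uniformly (in volume) from the shell r_low <= ||z - center|| <= r_high."""
    d = len(center)
    direction = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in direction)) or 1.0
    # Inverse-CDF sampling of the radius so that density is uniform over the shell.
    u = rng.random()
    radius = (r_low ** d + u * (r_high ** d - r_low ** d)) ** (1.0 / d)
    return [c + radius * v / norm for c, v in zip(center, direction)]


def closest_enemy(predict, x, step=0.1, n_samples=200, max_radius=10.0, seed=0):
    """Grow spherical layers around x until a sampled point gets a different prediction.

    Only `predict` is queried: no training data and no model internals are used.
    Returns the closest such point found, or None if max_radius is reached.
    """
    rng = random.Random(seed)
    target = predict(x)
    r_low = 0.0
    while r_low < max_radius:
        r_high = r_low + step
        layer = [sample_in_layer(x, r_low, r_high, rng) for _ in range(n_samples)]
        enemies = [z for z in layer if predict(z) != target]
        if enemies:
            return min(enemies, key=lambda z: math.dist(x, z))
        r_low = r_high
    return None


def sparsify(predict, x, enemy):
    """Greedily reset coordinates of `enemy` to their value in x while the prediction still differs."""
    target = predict(x)
    result = list(enemy)
    # Try to undo the smallest changes first (an assumed heuristic).
    order = sorted(range(len(x)), key=lambda i: abs(result[i] - x[i]))
    for i in order:
        trial = list(result)
        trial[i] = x[i]
        if predict(trial) != target:
            result = trial
    return result
```

With a toy classifier such as `predict = lambda z: int(z[0] > 1.0)` and `x = [0.0, 0.0]`, calling `sparsify(predict, x, closest_enemy(predict, x))` returns a point that differs from `x` only in its first coordinate, which is the kind of minimal, sparse change the abstract refers to.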


The Dangers of Post-hoc Interpretability: Unjustified Counterfactual Explanations

Laugel, Lesot, Marsala et al. 2019

Preprint


Unjustified Classification Regions and Counterfactual Explanations in Machine Learning

Laugel, Lesot, Marsala et al. 2020


Defining Locality for Surrogates in Post-hoc Interpretablity

Laugel, Renard, Lesot et al. 2018

Preprint


How to Choose an Explainability Method? Towards a Methodical Implementation of XAI in Practice

Vermeire, Laugel, Renard et al. 2021


Detecting Potential Local Adversarial Examples for Human-Interpretable Defense

Renard, Laugel, Lesot et al. 2019


Machine learning models are increasingly used in the industry to make decisions such as credit insurance approval. Some people may be tempted to manipulate specific variables, such as the age or the salary, in order to get better chances of approval. In this ongoing work, we propose to discuss, with a first proposition, the issue of detecting a potential local adversarial example on classical tabular data by providing to a human expert the locally critical features for the classifier's decision, in order to control the provided information and avoid a fraud.


Understanding surrogate explanations: the interplay between complexity, fidelity and coverage

Poyiadzi, Renard, Laugel et al. 2021

Preprint


This paper analyses the fundamental ingredients behind surrogate explanations to provide a better understanding of their inner workings. We start our exposition by considering global surrogates, describing the trade-off between complexity of the surrogate and fidelity to the black-box being modelled. We show that transitioning from global to local (reducing coverage) allows for more favourable conditions on the Pareto frontier of fidelity-complexity of a surrogate. We discuss the interplay between complexity, fidelity and coverage, and consider how different user needs can lead to problem formulations where these are either constraints or penalties. We also present experiments that demonstrate how the local surrogate interpretability procedure can be made interactive and lead to better explanations.
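The trade-off between coverage and fidelity at fixed complexity can be seen in a toy setting. The sketch below is not taken from the paper: it assumes a one-dimensional black box, a linear surrogate (fixed complexity) fitted by ordinary least squares, coverage expressed as the half-width of an interval around a point of interest, and infidelity measured as mean squared error. All function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def surrogate_error(black_box, center, half_width, n=201):
    """Fit a linear surrogate on [center - half_width, center + half_width].

    Returns the mean squared error between surrogate and black box on that
    interval (lower error means higher fidelity).
    """
    xs = [center - half_width + 2 * half_width * i / (n - 1) for i in range(n)]
    ys = [black_box(x) for x in xs]
    slope, intercept = fit_line(xs, ys)
    residuals = [(y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)]
    return sum(residuals) / n
```

For a curved black box such as `lambda x: x ** 2`, `surrogate_error(black_box, 1.0, 0.2)` is smaller than `surrogate_error(black_box, 1.0, 2.0)`: shrinking the region the surrogate has to cover lets a surrogate of the same complexity reach higher fidelity.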



Thibault Laugel
