2017
DOI: 10.48550/arxiv.1711.06104
Preprint

Towards better understanding of gradient-based attribution methods for Deep Neural Networks

Abstract: Understanding the flow of information in Deep Neural Networks (DNNs) is a challenging problem that has gained increasing attention over the last few years. While several methods have been proposed to explain network predictions, there have been only a few attempts to compare them from a theoretical perspective. What is more, no exhaustive empirical comparison has been performed in the past. In this work, we analyze four gradient-based attribution methods and formally prove conditions of equivalence and approxima…
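As a rough illustration of the kind of technique the paper analyzes, the sketch below implements Gradient * Input, one of the simplest gradient-based attribution methods, for a generic PyTorch classifier. The `model`, the input shape, and the `target_class` argument are placeholders for illustration, not the paper's own code.

```python
import torch

def gradient_times_input(model, x, target_class):
    # x: a single input with a leading batch dimension, e.g. shape [1, d] (assumed).
    x = x.detach().clone().requires_grad_(True)
    score = model(x)[0, target_class]          # scalar score of the class of interest
    grad, = torch.autograd.grad(score, x)      # d(score)/d(x), same shape as x
    return (grad * x).detach()                 # element-wise attribution per input feature
```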

Cited by 114 publications (166 citation statements)
References 15 publications (23 reference statements)
“…Similarly, Integrated Gradients computes the average gradient of the output with respect to each input feature by integrating from a baseline to the current feature value [35]. DeepLIFT (Deep Learning Important FeaTures) [36] works with deep NNs and it is a good approximation to Integrated Gradients [37]. Similar to integrated gradients it also defines a "reference activation" which is often viewed as "uninformative" in context, e.g.…”
Section: Feature Importance Methods
Citation type: mentioning (confidence: 99%)
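To make the Integrated Gradients description above concrete, here is a minimal sketch of the usual path-integral approximation: gradients are averaged along a straight line from a baseline ("reference") to the input and then scaled by the input's distance from that baseline. The PyTorch `model`, the zero baseline, and the step count are assumptions for illustration.

```python
import torch

def integrated_gradients(model, x, target_class, baseline=None, steps=50):
    if baseline is None:
        baseline = torch.zeros_like(x)          # a common "uninformative" reference
    x, baseline = x.detach(), baseline.detach()
    total_grad = torch.zeros_like(x)
    for k in range(1, steps + 1):
        # point on the straight-line path from the baseline to the input
        point = (baseline + (k / steps) * (x - baseline)).requires_grad_(True)
        score = model(point)[0, target_class]   # scalar score of the target class
        grad, = torch.autograd.grad(score, point)
        total_grad += grad
    avg_grad = total_grad / steps               # Riemann approximation of the path integral
    return (x - baseline) * avg_grad            # attribution per input feature
```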
“…the feature of interest is computed pointwise after a smoothing procedure. Symbolic derivatives are commonly used to determine the importance of features for neural networks (Ancona et al 2017). While MEs provide interpretations in terms of prediction changes, most methods provide an interpretation in terms of prediction levels.…”
Section: Interpretable Machine Learning
Citation type: mentioning (confidence: 99%)
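The contrast drawn in this excerpt, interpretations in terms of prediction changes versus symbolic (automatic) derivatives of the prediction itself, can be sketched as follows. The differentiable model `f`, assumed to map a 1-D feature vector to a single scalar prediction, and all names are hypothetical.

```python
import torch

def exact_partial(f, x, feature_idx):
    # Symbolic/automatic derivative of the prediction w.r.t. one feature at x.
    x = x.detach().clone().requires_grad_(True)
    y = f(x).squeeze()                          # assumed to be a scalar prediction
    grad, = torch.autograd.grad(y, x)
    return grad[feature_idx].item()

def prediction_change(f, x, feature_idx, delta=0.1):
    # Change in the prediction itself when one feature is perturbed by delta.
    x_shifted = x.detach().clone()
    x_shifted[feature_idx] += delta
    return (f(x_shifted) - f(x)).item()
```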
“…where changes in activations would change the output most (the gradients). An overview of various methods for salience mapping is available elsewhere [32]. The class activation map (CAM)/grad-CAM [33,34] approach builds a map of the input regions that are responsible for a classification by calculating how the different convolutional filters contribute to that classification and building a weighted average of these activations, which can then be projected onto the input image, the operation of CAM is presented schematically in Figure 3.…”
Section: Deep Interpretations
Citation type: mentioning (confidence: 99%)
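A condensed sketch of the Grad-CAM recipe described in this excerpt is given below. It assumes the activations of the last convolutional layer have been captured for one image (for example with a forward hook) and are still attached to the autograd graph; the names, shapes, and the bilinear up-sampling step are illustrative assumptions, not the cited authors' code.

```python
import torch
import torch.nn.functional as F

def grad_cam(conv_features, class_score, output_size):
    # conv_features: last-conv-layer activations for one image, shape [C, H, W] (assumed).
    # class_score:   scalar score of the class of interest, computed from these activations.
    grads, = torch.autograd.grad(class_score, conv_features, retain_graph=True)
    weights = grads.mean(dim=(1, 2))                          # one weight per convolutional filter
    cam = torch.relu((weights[:, None, None] * conv_features.detach()).sum(dim=0))
    cam = cam / (cam.max() + 1e-8)                            # normalise to [0, 1]
    # Up-sample to the input resolution so the map can be overlaid on the image.
    return F.interpolate(cam[None, None], size=output_size,
                         mode="bilinear", align_corners=False)[0, 0]
```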