2020
DOI: 10.1016/j.patcog.2020.107198
Towards explaining anomalies: A deep Taylor decomposition of one-class models

Abstract: A common machine learning task is to discriminate between normal and anomalous data points. In practice, it is not always sufficient to reach high accuracy at this task; one would also like to understand why a given data point has been predicted in a certain way. We present a new principled approach for one-class SVMs that decomposes outlier predictions in terms of input variables. The method first recomposes the one-class model as a neural network with distance functions and min-pooling, and then performs a d…
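The abstract's "distance functions and min-pooling" view can be made concrete with a small sketch. The support vectors and test point below are invented for illustration, and a hard min is used in place of the soft pooling a trained model would use:

```python
import numpy as np

# Hypothetical support vectors of a learned one-class model, and a test point
support_vectors = np.array([[0.0, 0.0],
                            [1.0, 1.0],
                            [2.0, 0.5]])
x = np.array([0.2, 0.1])

# Layer 1: squared Euclidean distances to each support vector
d = np.sum((support_vectors - x) ** 2, axis=1)

# Layer 2: min-pooling -- the outlier score is the distance to the
# nearest support vector (large distance = more anomalous)
outlier_score = d.min()
```

In the paper, deep Taylor decomposition is then applied to such a network to redistribute the outlier score onto the input variables.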

Cited by 74 publications (61 citation statements)
References 45 publications (63 reference statements)
“…Such explanations help to verify the predictions and establish trust in the correct functioning of the system. Layer-wise Relevance Propagation (LRP) [9,58] provides a general framework for explaining individual predictions, i.e., it is applicable to various ML models, including neural networks [9], LSTMs [7], Fisher Vector classifiers [44] and Support Vector Machines [35]. Section 4 gives an overview of recently proposed methods for computing individual explanations.…”
Section: Explaining Individual Predictions
confidence: 99%
“…The propagation process can be theoretically embedded in the deep Taylor decomposition framework [59]. More recently, LRP was extended to a wider set of machine learning models, e.g., in clustering [36] or anomaly detection [35], by first transforming the model into a neural network ('neuralization') and then applying LRP to explain its predictions. Leveraging the model structure, together with appropriate (theoretically motivated) propagation rules, enables LRP to deliver good explanations at very low computational cost (one forward and one backward pass).…”
Section: Propagation-based Approaches (Leveraging Structure)
confidence: 99%
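The "one forward and one backward pass" mechanics of LRP mentioned above can be sketched on a toy network. The weights and the basic LRP-0 (z-rule) used here are illustrative; the cited works employ additional, refined propagation rules:

```python
import numpy as np

# Toy two-layer ReLU network (weights are illustrative, not from the paper)
W1 = np.array([[1.0, -0.5],
               [0.5,  1.0]])
W2 = np.array([[1.0],
               [1.0]])
x  = np.array([1.0, 2.0])

# Forward pass, keeping the activations needed for propagation
a1 = np.maximum(0.0, x @ W1)      # hidden ReLU layer
f  = float(a1 @ W2)               # scalar prediction

def lrp_step(a, W, R):
    """Basic LRP-0 rule: redistribute relevance R from outputs to
    inputs in proportion to their contributions z_jk = a_j * w_jk."""
    z = a[:, None] * W            # input-to-output contributions
    s = z.sum(axis=0) + 1e-12     # per-output normalizer (stabilized)
    return (z / s) @ R            # relevance passed back to the inputs

# Backward relevance pass: relevance starts at the output value
R2 = np.array([f])
R1 = lrp_step(a1, W2, R2)
R0 = lrp_step(x,  W1, R1)         # input-level relevance scores

# Conservation: total relevance is (approximately) preserved per layer
```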
“…(Ribeiro et al, 2016). Recently, some interpretation methods have emerged to understand models beyond classification tasks (Samek et al, 2020; Kauffmann et al, 2020), including the one we present in this paper for the purpose of cluster explanation. ACE's perturbation approach draws inspiration from adversarial machine learning (Xu et al, 2020), where imperceptible perturbations are maliciously crafted to mislead a machine learning model into predicting incorrect outputs.…”
Section: Related Work
confidence: 94%
“…Some approaches have been extended to unsupervised models, e.g. anomaly detection [38], [39] and clustering [40], and attention models have also been developed to explain tasks different from classification such as image captioning [41] or similarity [42]. Our work goes further along this direction and explains similarity built on general neural network models, and by identifying relevant pairs of input features.…”
Section: Related Work
confidence: 99%
“…a_jk(x) = min(a_j(x), τ(a_k(x))). The 'min' operation can be interpreted as a continuous 'AND' [38], and tests at each location for the presence of bigrams jk ∈ 00-99. The function τ represents some translation operation, and we apply several of them to produce candidate alignments between the digits forming the bigrams (e.g.…”
Section: The 'Bigram Network'
confidence: 99%
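As a rough illustration of the min-as-continuous-AND idea in the quote above: the digit activations below are made up, and the translation operator τ is omitted for simplicity.

```python
import numpy as np

# Hypothetical detector activations for digits 0-9 at one image location
a = np.zeros(10)
a[3], a[7] = 0.9, 0.8        # digits '3' and '7' strongly detected

# Bigram response a_jk = min(a_j, a_k): a continuous AND that is large
# only if BOTH digit j and digit k are present
bigram = np.minimum(a[:, None], a[None, :])   # entry (j, k) = bigram 'jk'
np.fill_diagonal(bigram, 0.0)                 # ignore trivial pairs jj

best = np.unravel_index(bigram.argmax(), bigram.shape)
```

Here the strongest bigram response is for the pair of digits that are both active, consistent with the AND-like behavior described in the quote.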