2024
DOI: 10.1002/widm.1567
Adversarial Attacks in Explainable Machine Learning: A Survey of Threats Against Models and Humans

Jon Vadillo, Roberto Santana, Jose A. Lozano

Abstract: Reliable deployment of machine learning models such as neural networks continues to be challenging due to several limitations. Some of the main shortcomings are the lack of interpretability and the lack of robustness against adversarial examples or out-of-distribution inputs. In this paper, we comprehensively review the possibilities and limits of adversarial attacks for explainable machine learning models. First, we extend the notion of adversarial examples to fit in explainable machine learning scenarios whe…
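Context for the abstract: the standard notion of an adversarial example is a small input perturbation, bounded in some norm by a budget epsilon, that changes a model's prediction. The sketch below illustrates only that baseline notion via the fast gradient sign method (FGSM), not the paper's extended formulation for explainable models; the model, tensors, and epsilon value are hypothetical placeholders.

    # Minimal FGSM sketch (illustrative only; placeholders, not from the paper)
    import torch
    import torch.nn as nn

    def fgsm_attack(model: nn.Module, x: torch.Tensor, y: torch.Tensor,
                    epsilon: float = 0.03) -> torch.Tensor:
        # Craft x_adv = x + delta with ||delta||_inf <= epsilon that aims to
        # change the model's prediction.
        x_adv = x.clone().detach().requires_grad_(True)
        loss = nn.functional.cross_entropy(model(x_adv), y)
        loss.backward()
        # Step in the direction that increases the loss, bounded by epsilon,
        # and keep the result in the valid input range [0, 1].
        return (x_adv + epsilon * x_adv.grad.sign()).detach().clamp(0.0, 1.0)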

Cited by 0 publications
References 133 publications