2018
DOI: 10.48550/arxiv.1809.02104
Preprint

Are adversarial examples inevitable?

Abstract: A wide range of defenses have been proposed to harden neural networks against adversarial attacks. However, a pattern has emerged in which the majority of adversarial defenses are quickly broken by new attacks. Given the lack of success at generating robust defenses, we are led to ask a fundamental question: Are adversarial attacks inevitable? This paper analyzes adversarial examples from a theoretical perspective, and identifies fundamental bounds on the susceptibility of a classifier to adversarial attacks. …

Cited by 48 publications (61 citation statements)
References 31 publications (39 reference statements)
“…Daniely et al [312] provided a theoretical analysis that studies the vulnerability of ReLU networks against adversarial perturbations, concluding that most ReLU networks suffer from ℓ2 perturbations. We also find a similar but broader claim that adversarial examples are inevitable for certain types of problems in [313].…”
Section: A. On Input-specific Perturbations (supporting)
confidence: 68%
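As a rough illustration of the ℓ2 perturbations discussed in the statement above (not the method of [312] or [313]), the sketch below runs a projected gradient attack under an ℓ2 budget against a small, randomly initialized ReLU network. The model, input, label, and budget are all placeholder assumptions.

```python
# Hedged sketch: projected gradient ascent under an l2 budget against a toy ReLU net.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))  # placeholder ReLU network
x = torch.randn(1, 20)            # a single (random) input
y = torch.tensor([0])             # its assumed correct label
eps, alpha, steps = 1.0, 0.2, 20  # l2 budget, step size, iterations (illustrative values)

delta = torch.zeros_like(x, requires_grad=True)
for _ in range(steps):
    loss = nn.functional.cross_entropy(model(x + delta), y)
    loss.backward()
    with torch.no_grad():
        # step along the normalized gradient, then project back onto the l2 ball of radius eps
        g = delta.grad
        delta += alpha * g / (g.norm() + 1e-12)
        norm = delta.norm()
        if norm > eps:
            delta *= eps / norm
    delta.grad.zero_()

print("prediction before:", model(x).argmax().item(),
      "after:", model(x + delta).argmax().item())
```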
“…Sometimes the defense strategies reflect different interpretations of the underlying causes of NN vulnerability to adversarial attacks. The insufficient mapping of the input space during training [29] was argued to be a consequence of the high dimensionality of the input or its non-linear processing [28], [30], [31], in which case a dimensionality reduction could help circumvent many attacks. Similar motivation led to different loss function or activation function modifications in order to enhance the model's robustness [4], [32].…”
Section: B. Defense Methods (mentioning)
confidence: 99%
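A minimal sketch of the dimensionality-reduction defense idea mentioned above, assuming a toy dataset and an off-the-shelf classifier; it is not the procedure of [28], [30], or [31], only an illustration of projecting inputs onto a few principal components before classification.

```python
# Hedged sketch: PCA front end as a dimensionality-reduction preprocessing defense.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 100))              # placeholder high-dimensional inputs
y = (X[:, :5].sum(axis=1) > 0).astype(int)   # labels depend only on a low-dimensional subspace

# The defense: reduce 100 input dimensions to 10 before the classifier ever sees them,
# discarding directions an attacker could otherwise exploit.
defended = make_pipeline(PCA(n_components=10), LogisticRegression(max_iter=1000))
defended.fit(X, y)
print("train accuracy with PCA front end:", defended.score(X, y))
```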
“…The above techniques can be applied on an ensemble of models, and thus increase the probability of creating a transferable attack, or even a universal attack [23]. Some attacks target other aspects of NN computation; for example, they attempt to change the heatmaps produced by various interpretation methods [24], [25], or attack through model manipulation [26] or through poisoning the training data [27], [28] rather than through input perturbations.…”
Section: A. Adversarial Attacks (mentioning)
confidence: 99%
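The following is an illustrative sketch, not the attack of [23]: a one-step gradient perturbation crafted against the average loss of a small ensemble of placeholder models, which is the usual way ensemble-based transferable attacks are built. All models and data here are random stand-ins.

```python
# Hedged sketch: one-step (FGSM-style) attack on the averaged loss of an ensemble.
import torch
import torch.nn as nn

torch.manual_seed(0)
ensemble = [nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))
            for _ in range(3)]                 # placeholder ensemble members

x = torch.randn(1, 20, requires_grad=True)     # random input standing in for real data
y = torch.tensor([0])                          # assumed correct label
eps = 0.1                                      # l_inf budget (illustrative)

# Average the loss over all ensemble members before taking the gradient,
# so the perturbation is not tailored to any single model.
loss = sum(nn.functional.cross_entropy(m(x), y) for m in ensemble) / len(ensemble)
loss.backward()
x_adv = (x + eps * x.grad.sign()).detach()

# A held-out (also random) model, only to show how a transfer check would look.
held_out = nn.Sequential(nn.Linear(20, 32), nn.ReLU(), nn.Linear(32, 2))
with torch.no_grad():
    print("held-out model prediction, clean vs. adversarial:",
          held_out(x).argmax().item(), held_out(x_adv).argmax().item())
```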
“…However, the existence of adversarial examples and the large number of different techniques available for finding them [1,7,8,9,10,11] clearly prove the shortcoming of our intuition with respect to the structure of those decision boundaries. In recent years it has become quite clear that class label changes can occur within a small distance from a correctly classified input [12,13,14] and that the structure of the decision boundary in proximity to correctly classified input is far from being spherical.…”
Section: Adversarial Examples Are Defined By a Classifier's Decision ... (mentioning)
confidence: 99%
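As a toy illustration of how close a decision boundary can sit to a correctly classified input, the sketch below walks from a point along its loss-gradient direction until the predicted label flips. Every component (model, input, step size) is an assumption, not taken from [12], [13], or [14].

```python
# Hedged sketch: estimate the distance to the decision boundary along the gradient direction.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 2))  # placeholder classifier

x = torch.randn(1, 2)
y = model(x).argmax(dim=1)        # treat the current prediction as the "correct" label

x_req = x.clone().requires_grad_(True)
nn.functional.cross_entropy(model(x_req), y).backward()
direction = x_req.grad / x_req.grad.norm()   # unit direction that increases the loss

# Walk outward in small steps; report the first distance at which the label changes.
for step in range(1, 2001):
    r = 0.01 * step
    with torch.no_grad():
        if model(x + r * direction).argmax(dim=1) != y:
            print(f"label flips at an l2 distance of about {r:.2f}")
            break
else:
    print("no label change within the searched radius")
```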