2018
DOI: 10.48550/arXiv.1805.09190
Preprint

Towards the first adversarially robust neural network model on MNIST

Lukas Schott,
Jonas Rauber,
Matthias Bethge
et al.

Abstract: Despite much effort, deep neural networks remain highly susceptible to tiny input perturbations, and even for MNIST, one of the most common toy datasets in computer vision, no neural network model exists for which adversarial perturbations are large and make semantic sense to humans. We show that even the widely recognized and by far most successful defense by Madry et al. (1) overfits on the L∞ metric (it is highly susceptible to L2 and L0 perturbations), (2) classifies unrecognizable images with high certai…

Cited by 39 publications (75 citation statements)
References 12 publications
“…The creators of sparse-rs have shown their framework outperforms all previous black- and white-box attacks, and hence we use this attack within our adversarial training framework and after training to approximately measure the robust accuracy of our classifier. We also utilize the Pointwise Attack [28] to directly compare our results with other ℓ0-defense techniques [30]. This attack tries to greedily minimize the ℓ0-norm by first adding salt-and-pepper noise and then repeatedly resetting perturbed pixels while keeping the image misclassified.…”
Section: Methods
confidence: 99%
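The two-phase procedure described in the statement above is simple enough to sketch. Below is a minimal NumPy illustration, not the reference implementation (the attack ships with the authors' Foolbox toolbox); `model_predict`, the noise schedule, and the binary {0, 1} pixel values are assumptions for a [0, 1]-scaled grayscale input such as MNIST.

```python
import numpy as np

def pointwise_attack(model_predict, image, label, rng=None):
    """Greedy L0 attack sketch: (1) add salt-and-pepper noise until the
    image is misclassified, (2) repeatedly try to reset perturbed pixels
    to their original values while keeping the image misclassified."""
    if rng is None:
        rng = np.random.default_rng(0)
    adv = None
    # Phase 1: grow the salt-and-pepper noise level until misclassification.
    for frac in np.linspace(0.01, 1.0, 100):
        mask = rng.random(image.shape) < frac
        candidate = image.copy()
        candidate[mask] = rng.choice([0.0, 1.0], size=int(mask.sum()))
        if model_predict(candidate) != label:
            adv = candidate
            break
    if adv is None:
        return None  # no misclassifying starting point found

    # Phase 2: greedily restore pixels; keep a perturbed pixel only if
    # resetting it would flip the prediction back to the true label.
    changed = True
    while changed:  # repeat passes until no pixel can be restored
        changed = False
        perturbed = np.flatnonzero(adv != image)
        rng.shuffle(perturbed)
        for idx in perturbed:
            saved = adv.flat[idx]
            adv.flat[idx] = image.flat[idx]  # tentatively reset this pixel
            if model_predict(adv) == label:
                adv.flat[idx] = saved        # reset broke the attack; undo
            else:
                changed = True               # pixel restored; L0 reduced
    return adv
```

Each outer pass can only shrink the set of perturbed pixels, so the loop terminates; the result is an adversarial example whose ℓ0 distance to the original is locally minimal under single-pixel resets.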
“…In this paper we focus on a different setting, where adversarial perturbations are constrained using the ℓ0-norm. This setting has gained considerable attention [9,27,21,28,29,30] due to applications in object detection [31,32] and NLP [33]. In these applications, robustness guarantees against ℓ0-attacks are specifically important since there is an inherent limit on the number of input features that can be modified.…”
Section: Introduction
confidence: 99%
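For context on the constraint referenced in the statement above: the ℓ0 "norm" counts how many input features a perturbation modifies, which is why it models attackers that may touch only a few pixels. A minimal formulation, with assumed notation (classifier f, loss L, pixel budget k):

```latex
\[
  \|\delta\|_0 = \bigl|\{\, i : \delta_i \neq 0 \,\}\bigr|,
  \qquad
  \max_{\delta}\; L\bigl(f(x+\delta),\, y\bigr)
  \quad \text{s.t.} \quad \|\delta\|_0 \le k .
\]
```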