2021
DOI: 10.48550/arxiv.2105.10304
Preprint
Exploring Misclassifications of Robust Neural Networks to Enhance Adversarial Attacks

Abstract: Progress in making neural networks more robust against adversarial attacks is mostly marginal, despite the great efforts of the research community. Moreover, the robustness evaluation is often imprecise, making it difficult to identify promising approaches. We analyze the classification decisions of 19 different state-of-the-art neural networks trained to be robust against adversarial attacks. Our findings suggest that current untargeted adversarial attacks induce misclassification towards only a limited amoun…

Cited by 5 publications (9 citation statements)
References 5 publications
“…the next best method. Additionally, we find that SwARo performs 4–9% better than baselines on Jitter attacks, an adversarial attack method that incorporates scale invariance and encourages diverse attack targets with smaller perturbations (Schwinn et al. [2021]). These results make SwARo more appealing in practice and suggest that our approach to enforcing adversarial perturbations, which considers both positive and negative pairs as well as semantic cluster information, ensures robustness against a diverse set of attack types.…”
Section: White Box Attacks
confidence: 76%
“…Ilyas et al [2019] hypothesize that the adversarial vulnerability of neural networks is a direct result of their sensitivity to well-generalizing features that are incomprehensible to humans. Schwinn et al [2021] discover that cross-entropy attacks fail against models with large logits, and propose to add logit noise and enforce scale invariance on the loss to mitigate this limitation and encourage the model to design diverse attack targets. All above methods are originally designed for supervised learning tasks.…”
Section: Related Workmentioning
confidence: 99%
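The citation statement above only sketches the mechanism of the Jitter attack at a high level (scale invariance on the loss plus logit noise). Below is a minimal, hedged illustration of that idea in PyTorch; it is not the authors' exact formulation, and the function name jitter_style_loss, the per-sample max-magnitude normalization, and the noise_std parameter are illustrative assumptions rather than values from the paper.

```python
import torch
import torch.nn.functional as F

def jitter_style_loss(logits, labels, noise_std=0.1):
    """Illustrative scale-invariant, noise-injected attack loss (assumed form)."""
    # Scale invariance: normalize each sample's logits by their largest magnitude,
    # so models with very large logits cannot trivially saturate the loss.
    scale = logits.abs().amax(dim=1, keepdim=True).clamp_min(1e-12)
    z = logits / scale
    # Logit noise: randomly perturbing the normalized logits encourages the
    # attack to move towards diverse (non-dominant) target classes.
    z = z + noise_std * torch.randn_like(z)
    # Untargeted objective: the attacker maximizes this loss w.r.t. the input.
    return F.cross_entropy(z, labels)
```

In use, an attacker would maximize this loss with respect to the input inside an iterative attack loop (e.g., PGD) instead of the plain cross-entropy loss.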
“…We adopt FGSM [9], I-FGSM [11], C&W [24], TPGD [38], and Jitter [39] as the comparison methods, along with the proposed Mixup-Attack and Mixcut-Attack methods, to conduct the untargeted black-box adversarial attack for both scene classification and semantic segmentation tasks. The perturbation level and the step size α in all methods are fixed to 1.…”
Section: B. Experimental Settings and Implementation Details
confidence: 99%