Preprint, 2022
DOI: 10.48550/arxiv.2205.07460
Diffusion Models for Adversarial Purification

Abstract: Adversarial purification refers to a class of defense methods that remove adversarial perturbations using a generative model. These methods do not make assumptions on the form of attack and the classification model, and thus can defend pre-existing classifiers against unseen threats. However, their performance currently falls behind adversarial training methods. In this work, we propose DiffPure that uses diffusion models for adversarial purification: Given an adversarial example, we first diffuse it with a small amount of noise following a forward diffusion process, and then recover the clean image through a reverse generative process.
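To make the purification recipe in the abstract concrete, here is a minimal sketch of diffusion-based purification: diffuse the input for a small number of forward steps, then run the reverse denoising chain back to a clean image. The `denoiser` callable (a pretrained noise predictor), the `t_star` cutoff, and the linear beta schedule are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of diffusion-based purification in the style of DiffPure.
# `denoiser` is a hypothetical pretrained noise predictor eps_theta(x_t, t);
# the DDPM schedule below is standard but the hyperparameters are assumptions.
import torch

def purify(x_adv, denoiser, t_star=100, T=1000):
    """Diffuse x_adv for t_star forward steps, then denoise back to t=0."""
    betas = torch.linspace(1e-4, 0.02, T)            # linear noise schedule
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)

    # Forward process: one-shot jump to timestep t_star (closed form).
    a_bar = alpha_bars[t_star - 1]
    x_t = a_bar.sqrt() * x_adv + (1 - a_bar).sqrt() * torch.randn_like(x_adv)

    # Reverse process: ancestral sampling from t_star down to 1.
    for t in reversed(range(t_star)):
        eps = denoiser(x_t, torch.tensor([t]))       # predicted noise
        coef = betas[t] / (1 - alpha_bars[t]).sqrt()
        mean = (x_t - coef * eps) / alphas[t].sqrt()
        noise = torch.randn_like(x_t) if t > 0 else torch.zeros_like(x_t)
        x_t = mean + betas[t].sqrt() * noise
    return x_t  # purified image, fed to the unchanged classifier
```

The design hinge, per the abstract, is that `t_star` is small: enough noise to wash out the adversarial perturbation, but not so much that the reverse process loses the image's semantics.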

Cited by 15 publications (19 citation statements). References 19 publications (40 reference statements).
“…The work of [15] first studied this problem; however, they did not obtain significant accuracy improvements, likely because the diffusion models available at the time were not good enough. Separately, [18] suggest that diffusion models might provide strong empirical robustness to adversarial examples, as evaluated by robustness under adversarial attacks computed with existing attack algorithms; this is orthogonal to our results.…”
Section: Related Work (mentioning)
confidence: 53%
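The "existing attack algorithms" this statement refers to are typically first-order methods such as PGD; the sketch below shows the standard L-infinity variant. The `model`, `loss_fn`, and the eps/alpha/steps settings are hypothetical placeholders, not values from any cited paper.

```python
# Minimal PGD (projected gradient descent) attack sketch, the kind of
# "existing attack algorithm" used to measure empirical robustness.
import torch

def pgd_attack(model, loss_fn, x, y, eps=8/255, alpha=2/255, steps=10):
    """L-infinity PGD: signed-gradient ascent steps, projected to the eps-ball."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = loss_fn(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()   # ascent step on the loss
        x_adv = x + (x_adv - x).clamp(-eps, eps)       # project back to the ball
        x_adv = x_adv.clamp(0.0, 1.0)                  # keep pixels valid
    return x_adv.detach()
```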
“…The robustness continues to scale with model capacity, and RobArch-L achieves the new SOTA AA accuracy on the RobustBench leaderboard. Note that ResNet-50+DiffPure [39] introduces a novel AT method that uses diffusion models [22] for adversarial purification. Although that method improves AA accuracy by 5.97 percentage points, our architecture modifications show stronger robustness even without fine-tuning the Standard-AT method.…”
Section: Robust Architecture Design Results (mentioning)
confidence: 99%
“…Existing research has explored adversarial examples for different generative models, yet no proper framework has been formulated. Diffusion models have been used to improve the adversarial robustness of classifiers (Nie et al., 2022). One early example is the work of Kos et al. (2018).…”
Section: Related Work (mentioning)
confidence: 99%
“…Furthermore, the training objective of diffusion models is optimized indirectly through a variational bound and thus is not directly applicable to the optimization of adversarial examples. For these reasons, existing research only considers diffusion models as aids to improve the robustness of classifiers (Nie et al., 2022), leaving a gap in the formulation of adversarial examples for diffusion models.…”
Section: Introduction (mentioning)
confidence: 99%
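For reference, the "variational bound" this statement refers to is the standard evidence lower bound used to train DDPMs (Ho et al., 2020). A sketch in standard notation, illustrating why the model's likelihood is only optimized indirectly:

```latex
% Standard DDPM variational bound (Ho et al., 2020). Training minimizes
% this bound, not -log p_theta(x_0) itself, which is why the likelihood
% is not directly available for optimizing an adversarial input.
\begin{align}
-\log p_\theta(x_0)
  &\le \mathbb{E}_q\!\left[
      -\log \frac{p_\theta(x_{0:T})}{q(x_{1:T}\mid x_0)}
    \right] \\
  &= \mathbb{E}_q\!\Big[
      \underbrace{D_{\mathrm{KL}}\big(q(x_T\mid x_0)\,\|\,p(x_T)\big)}_{L_T}
      + \sum_{t>1}
      \underbrace{D_{\mathrm{KL}}\big(q(x_{t-1}\mid x_t,x_0)\,\|\,p_\theta(x_{t-1}\mid x_t)\big)}_{L_{t-1}}
      + \underbrace{\bigl(-\log p_\theta(x_0\mid x_1)\bigr)}_{L_0}
    \Big]
\end{align}
```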