2018
DOI: 10.48550/arXiv.1807.10272
Preprint

Evaluating and Understanding the Robustness of Adversarial Logit Pairing

Abstract: We evaluate the robustness of Adversarial Logit Pairing, a recently proposed defense against adversarial examples. We find that a network trained with Adversarial Logit Pairing achieves 0.6% correct classification rate under targeted adversarial attack, the threat model in which the defense is considered. We provide a brief overview of the defense and the threat models/claims considered, as well as a discussion of the methodology and results of our attack. Our results offer insights into the reasons underlying…
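For context on the threat model: a targeted attack succeeds if it pushes an input toward an attacker-chosen label. The sketch below is a generic targeted L-infinity projected gradient descent (PGD) attack in PyTorch, not the authors' actual evaluation code; the model handle and the eps/alpha/steps hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def targeted_pgd(model, x, target, eps=16/255, alpha=2/255, steps=100):
    """Targeted L-inf PGD: perturb x so the model predicts `target`.

    Illustrative sketch only; the hyperparameters are placeholders,
    not the settings used in the paper.
    """
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), target)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Targeted attack: descend on the loss toward the target class.
        x_adv = x_adv.detach() - alpha * grad.sign()
        # Project back into the eps-ball around x and the valid pixel range.
        x_adv = x + (x_adv - x).clamp(-eps, eps)
        x_adv = x_adv.clamp(0, 1)
    return x_adv
```

Under this threat model, the reported 0.6% is the fraction of attacked inputs that the defended network still classifies correctly.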

Cited by 40 publications (62 citation statements: 2 supporting, 60 mentioning, 0 contrasting) | References 5 publications

Citation statements, ordered by relevance:
“…For improving adversarial robustness, several studies focus on the flatness or smoothness of the loss landscape in the input data space because adversarial examples are perturbations in the input space [17,19,4,7]. Qin et al [17] have presented a regularization method to flatten the loss landscape in terms of data points.…”
Section: Adversarial Robustness (mentioning)
confidence: 99%
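One generic way to encourage such input-space flatness is to penalize the norm of the loss gradient with respect to the input, as in the minimal PyTorch sketch below. This is an illustrative instance of input-space smoothing, not necessarily the regularizer of Qin et al. [17], and the weight `lam` is an assumed placeholder.

```python
import torch
import torch.nn.functional as F

def flatness_regularized_loss(model, x, y, lam=1.0):
    """Cross-entropy plus an input-gradient-norm penalty.

    Penalizing ||d loss / d x|| encourages the loss surface to be
    locally flat around each data point. Generic sketch; `lam` is
    a placeholder weight, not a value from the cited works.
    """
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    # create_graph=True so the penalty stays differentiable w.r.t. weights.
    grad_x = torch.autograd.grad(loss, x, create_graph=True)[0]
    penalty = grad_x.flatten(1).norm(dim=1).mean()
    return loss + lam * penalty
```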
“…Adversarial robustness. To mitigate threats posed by adversarial examples [29,38], adversarial training [14,25,36,43,11,50,47] solves a min-max optimization problem in which adversarial examples are crafted to maximize the training loss, and these examples are then used to update network parameters during loss minimization. This process can be interpreted as approximately solving the saddle-point optimization problem:…”
Section: Background and Related Work (mentioning)
confidence: 99%
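The equation is cut off in the excerpt. The standard saddle-point formulation this passage refers to, as popularized by Madry et al. (the exact perturbation set varies across the cited works), is:

```latex
\min_{\theta} \; \mathbb{E}_{(x,y)\sim\mathcal{D}}
  \Big[ \max_{\|\delta\|_{p} \le \epsilon}
        \mathcal{L}\big(f_{\theta}(x + \delta),\, y\big) \Big]
```

Here θ are the network parameters, the inner maximization crafts a worst-case perturbation δ within an ε-ball, and the outer minimization updates θ against those adversarial examples, matching the two phases described in the excerpt.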
“…But the adoption of these models in safety-critical or high-security applications is inhibited due to two major concerns: their brittleness to adversarial attack methods that can make imperceptible modification to inputs and trigger wrong decisions (Szegedy et al., 2013; Papernot et al., 2016a), and the lack of interpretability (Gunning, 2017). Significant progress has also been made towards adversarial robustness (Papernot et al., 2016b; Madry et al., 2017; Engstrom et al., 2018) and explainability (Li & Yu, 2015; Yi et al., 2016; Sundararajan et al., 2017), and a few recent theoretical studies (Kilbertus et al., 2018; Chalasani et al., 2018) indicate a strong connection between these two issues.…”
Section: Introduction (mentioning)
confidence: 99%
“…• We propose a defense layer based on Kahneman's decomposition of cognition into intuitive System 1 and deliberative System 2. This approach does not rely on analyzing training data such as manifold-based defense (Ilyas et al., 2017; Jha et al., 2018), or statistical signature of the machine learning models such as logit pairing (Engstrom et al., 2018), or methods that exploit the knowledge of specific attack for adversarial training (Tramèr et al., 2017), or robust optimization with L_p norm bounds (Madry et al., 2017; Raghunathan et al., 2018).…”
Section: Introduction (mentioning)
confidence: 99%