2019 International Joint Conference on Neural Networks (IJCNN) 2019
DOI: 10.1109/ijcnn.2019.8851930

Not All Adversarial Examples Require a Complex Defense: Identifying Over-optimized Adversarial Examples with IQR-based Logit Thresholding

Abstract: Detecting adversarial examples currently stands as one of the biggest challenges in the field of deep learning. Adversarial attacks, which produce adversarial examples, increase the prediction likelihood of a target class for a particular data point. During this process, the adversarial example can be further optimized, even when it has already been wrongly classified with 100% confidence, thus making the adversarial example even more difficult to detect. For this kind of adversarial examples, which we refer t…

Citations: cited by 1 publication (1 citation statement)
References: 23 publications
“…They are relatively slow, however, requiring many iterations to reach a good solution. Moreover, since they optimize over the logit space, high-confidence adversarials produced by the C&W attacks may lead to "over-optimized" perturbations that can be easily identified via IQR-thresholding of the logit values, since such values are atypical of benign samples [43].…”
Section: The Carlini-Wagner Attacks (mentioning)
confidence: 99%
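
The IQR-thresholding idea mentioned in the citation statement above can be illustrated with a minimal sketch. The page does not give the paper's exact procedure, so everything here is an assumption made for illustration: the threshold is taken as the upper Tukey fence (Q3 + k * IQR, with k = 1.5) over the largest logit of each benign sample, and the function names `fit_iqr_threshold` and `flag_over_optimized` are hypothetical.

```python
import numpy as np


def fit_iqr_threshold(benign_logits, k=1.5):
    """Return an upper fence for the max logit, estimated from benign samples.

    Assumption: the fence is Q3 + k * IQR (Tukey's rule); the paper's own
    threshold definition may differ.
    """
    max_logits = np.max(benign_logits, axis=1)      # largest logit per sample
    q1, q3 = np.percentile(max_logits, [25, 75])    # first and third quartiles
    return q3 + k * (q3 - q1)


def flag_over_optimized(logits, threshold):
    """Flag inputs whose largest logit is atypically high for benign data."""
    return np.max(logits, axis=1) > threshold


# Toy benign logits: the max logits are 1, 2, ..., 9.
benign = np.array([[float(i), 0.0] for i in range(1, 10)])
threshold = fit_iqr_threshold(benign)

# One ordinary input and one with an unusually large logit.
candidates = np.array([[5.0, 0.0], [40.0, 1.0]])
flags = flag_over_optimized(candidates, threshold)
```

In this toy setup only the second candidate is flagged, because its largest logit lies far above the fence computed from the benign values. In practice the benign logits would come from a held-out set of clean inputs passed through the model under test.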