2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019
DOI: 10.1109/iccv.2019.00348
Adversarial Defense by Restricting the Hidden Space of Deep Neural Networks

Abstract: Deep neural networks are vulnerable to adversarial attacks, which can fool them by adding minuscule perturbations to the input images. The robustness of existing defenses suffers greatly under white-box attack settings, where an adversary has full knowledge about the network and can iterate several times to find strong perturbations. We observe that the main reason for the existence of such perturbations is the close proximity of different class samples in the learned feature space. This allows model decisions…
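The abstract attributes adversarial vulnerability to the close proximity of different class samples in the learned feature space. As an illustration only, and not the paper's exact objective, the PyTorch sketch below shows one way such a constraint can be written down: a hypothetical ClassSeparationLoss that pulls features toward a learnable per-class centre and pushes different class centres at least a margin apart. The class name, margin value, and weighting are assumptions.

    # Hypothetical sketch only -- not the loss proposed in this paper.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ClassSeparationLoss(nn.Module):
        def __init__(self, num_classes: int, feat_dim: int, margin: float = 10.0):
            super().__init__()
            # One learnable centre (prototype) per class in feature space.
            self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))
            self.margin = margin

        def forward(self, features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
            # Intra-class pull: squared distance of each feature to its own class centre.
            pull = (features - self.centers[labels]).pow(2).sum(dim=1).mean()
            # Inter-class push: hinge penalty when two distinct centres are closer than margin.
            dists = torch.cdist(self.centers, self.centers)
            dists = dists + torch.eye(dists.size(0), device=dists.device) * self.margin
            push = F.relu(self.margin - dists).pow(2).mean()
            return pull + push

In practice a term like this would be added to the standard cross-entropy loss with a small weight, e.g. loss = F.cross_entropy(logits, y) + 0.01 * sep_loss(features, y).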


Citations: Cited by 117 publications (106 citation statements)
References: 38 publications
“…Reference [32] proposed an efficient approach that brings adversarial samples onto the natural image manifold, restoring classification towards the correct classes. Reference [33] maximally separated the class polytopes, forcing the model to learn distinct and distant decision regions for each class.…”
Section: B. Defense Methods
confidence: 99%
“…Adversarial training adds adversarial samples to the training process, helping the model learn how to deal with an attacker [15, 27]. Pang et al. use an ensemble of models to increase decision robustness [38], while Mustafa et al. use class-wise disentanglement to restrict feature maps from crossing the decision boundaries [37]. However, Schott et al. showed that even building a robust classifier on the small MNIST dataset remains an unsolved problem [41].…”
Section: Related Work
confidence: 99%
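The adversarial training mentioned in the quote above [15, 27] injects adversarial samples into each training step. Below is a minimal sketch of the common PGD-based variant, assuming generic model, loader, and optimizer objects; the epsilon, step size, and step count are illustrative defaults, not values taken from the cited papers.

    # Minimal PGD adversarial-training sketch (illustrative hyperparameters).
    import torch
    import torch.nn.functional as F

    def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=7):
        # Random start inside the epsilon ball, clipped to the valid pixel range.
        x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            loss = F.cross_entropy(model(x_adv), y)
            grad = torch.autograd.grad(loss, x_adv)[0]
            # Ascend the loss, then project back into the epsilon ball.
            x_adv = x_adv.detach() + alpha * grad.sign()
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)
        return x_adv.detach()

    def adversarial_training_epoch(model, loader, optimizer):
        model.train()
        for x, y in loader:
            x_adv = pgd_attack(model, x, y)   # craft perturbed inputs on the fly
            optimizer.zero_grad()
            F.cross_entropy(model(x_adv), y).backward()
            optimizer.step()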
“…There have been several recent papers showing that using metric learning loss functions during training helps make neural networks more robust to adversarial examples [22][23][24].…”
Section: Our Taxonomy
confidence: 99%
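To make the "metric learning loss functions" referenced above concrete, here is a small illustrative example using PyTorch's built-in triplet margin loss; the tensors are random placeholders, and this is not the specific formulation used in [22]–[24].

    # Illustration only: a metric-learning term of the kind such works add
    # to the usual classification loss.
    import torch
    import torch.nn as nn

    triplet = nn.TripletMarginLoss(margin=1.0)

    # Anchor and positive share a class; negative comes from a different class.
    anchor, positive, negative = (torch.randn(32, 128) for _ in range(3))
    metric_loss = triplet(anchor, positive, negative)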
“…Mustafa et al. [23] used their own variation of the contrastive center-loss [25], which encourages both intra-class compactness and inter-class separation of the feature vectors or logits, i.e., the activations from the last hidden layer. The center loss [26] is a loss function that encourages the feature vectors of each class to lie close to each other (i.e., it encourages intra-class compactness), and the contrastive center-loss is a generalization of it that also encourages inter-class separation.…”
Section: Our Taxonomy
confidence: 99%
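As a rough sketch of the center loss / contrastive center-loss idea described in the quote above (a simplified reading of [25, 26], not their reference implementations): the distance of each feature to its own learnable class centre is minimized, and the contrastive form additionally divides by the distances to the other classes' centres so that inter-class separation is rewarded as well.

    # Simplified contrastive centre-loss sketch, assuming features of shape (B, D)
    # and integer labels of shape (B,). Not the exact formulation of [25, 26].
    import torch
    import torch.nn as nn

    class ContrastiveCenterLoss(nn.Module):
        def __init__(self, num_classes: int, feat_dim: int, eps: float = 1e-6):
            super().__init__()
            self.centers = nn.Parameter(torch.randn(num_classes, feat_dim))
            self.eps = eps

        def forward(self, features: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
            dists = torch.cdist(features, self.centers).pow(2)        # (B, C)
            intra = dists.gather(1, labels.unsqueeze(1)).squeeze(1)   # to own centre
            inter = dists.sum(dim=1) - intra                          # to all other centres
            # Plain centre loss would be intra.mean(); dividing by the inter-class
            # distances also pushes features away from the other classes' centres.
            return (intra / (inter + self.eps)).mean()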