2016 IEEE Symposium on Security and Privacy (SP)
DOI: 10.1109/sp.2016.41

Distillation as a Defense to Adversarial Perturbations Against Deep Neural Networks

Abstract: Deep learning algorithms have been shown to perform extremely well on many classical machine learning problems. However, recent studies have shown that deep learning, like other machine learning techniques, is vulnerable to adversarial samples: inputs crafted to force a deep neural network (DNN) to provide adversary-selected outputs. Such attacks can seriously undermine the security of the system supported by the DNN, sometimes with devastating consequences. For example, autonomous vehicles can be cra…
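The defense the title refers to trains a second ("distilled") network on the temperature-softened class probabilities produced by an initial network trained at the same temperature, then deploys the distilled network at temperature 1. Below is a minimal sketch of that two-stage procedure, assuming hypothetical PyTorch stand-ins (`teacher`, `student`, `train_loader`) rather than the authors' actual code or hyperparameters.

```python
# Hedged sketch of defensive distillation; model and loader names are placeholders.
import torch
import torch.nn.functional as F

T = 20.0  # distillation temperature; the paper evaluates a range of values

def train_teacher(teacher, loader, epochs=10, lr=1e-3):
    """Train the initial network with a temperature-T softmax on its logits."""
    opt = torch.optim.Adam(teacher.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            log_probs = F.log_softmax(teacher(x) / T, dim=1)
            F.nll_loss(log_probs, y).backward()
            opt.step()
    return teacher

def train_distilled(teacher, student, loader, epochs=10, lr=1e-3):
    """Train the distilled network on the teacher's soft labels, also at temperature T."""
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    teacher.eval()
    for _ in range(epochs):
        for x, _ in loader:
            with torch.no_grad():
                soft_labels = F.softmax(teacher(x) / T, dim=1)
            opt.zero_grad()
            log_probs = F.log_softmax(student(x) / T, dim=1)
            # Cross-entropy against the teacher's soft probability vectors.
            (-(soft_labels * log_probs).sum(dim=1).mean()).backward()
            opt.step()
    return student  # deployed at T = 1, i.e. plain softmax(student(x))
```

The intended effect is that the softened training targets smooth the distilled model's decision surface, reducing the gradient information an attacker can exploit when crafting perturbations.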

Cited by 2,428 publications (1,895 citation statements)
References 30 publications
“…Consequently, Papernot et al. [14] proposed a technique named Defensive Distillation, which is also based on retraining the network on a dimensionally-reduced set of training data. This approach, too, was recently shown to be insufficient in mitigating adversarial examples [22].…”
Section: Performance Of Proposed Policy Induction Attack
confidence: 99%
“…[7], the results of which verify the feasibility of policy induction attacks by incurring minimal perturbations in the environment or sensory inputs of an RL system. We also discuss the insufficiency of defensive distillation [14] and adversarial training [15] techniques as state of the art countermeasures proposed against adversarial example attacks on deep learning classifiers, and present potential techniques to mitigate the effect of policy induction attacks against DQNs.…”
Section: Introduction
confidence: 99%
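The statement above names adversarial training [15] alongside defensive distillation as the countermeasures under discussion. The sketch below illustrates adversarial training in the common single-step FGSM style, assuming a differentiable PyTorch classifier `model` and an `optimizer` (placeholder names, not taken from the cited work), with inputs assumed to lie in [0, 1].

```python
# Hedged sketch of FGSM-style adversarial training; names are illustrative placeholders.
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=0.1):
    """Craft a fast-gradient-sign perturbation of x under the model's loss."""
    x_adv = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x_adv), y).backward()
    # Clamp assumes inputs normalised to [0, 1].
    return (x_adv + eps * x_adv.grad.sign()).clamp(0.0, 1.0).detach()

def adversarial_training_step(model, optimizer, x, y, eps=0.1, alpha=0.5):
    """One parameter update mixing the clean loss and the loss on FGSM examples."""
    x_adv = fgsm(model, x, y, eps)
    optimizer.zero_grad()  # clear gradients accumulated while crafting x_adv
    loss = alpha * F.cross_entropy(model(x), y) \
         + (1 - alpha) * F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```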
“…However, the underlying architecture is straightforward when it comes to facilitating the flow of information forwards and backwards, greatly alleviating the effort in generating adversarial samples. Therefore, several ideas [12,23] have been proposed to enhance the complexity of DNN models,…”
Section: Enhancing Model Complexity
confidence: 99%
“…Once the two approximating DNN models are learned, the attacker can generate adversarial samples specific to this distillation-enhanced DNN model. Similar to [23], [12] proposed to stack an auto-encoder together with a standard DNN. It shows that this auto-encoding enhancement increases a DNN's resistance to adversarial samples.…”
Section: Input Nullification
confidence: 99%
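For the "auto-encoder stacked with a standard DNN" idea mentioned in the statement above, a minimal sketch follows. The layer sizes and module names are illustrative assumptions, not the architecture of the cited work; the auto-encoder front-end is meant to push inputs back toward the training data manifold before classification.

```python
# Hedged sketch of an auto-encoder front-end stacked before a standard classifier.
import torch.nn as nn

class AEStackedClassifier(nn.Module):
    def __init__(self, in_dim=784, hidden=128, n_classes=10):
        super().__init__()
        # Auto-encoder front-end: encode and reconstruct the (possibly perturbed) input.
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.decoder = nn.Sequential(nn.Linear(hidden, in_dim), nn.Sigmoid())
        # Standard DNN classifier operating on the reconstructed input.
        self.classifier = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(), nn.Linear(hidden, n_classes)
        )

    def forward(self, x):
        reconstructed = self.decoder(self.encoder(x))
        return self.classifier(reconstructed)
```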