Defending DNN Adversarial Attacks with Pruning and Logits Augmentation

Wang, Siyue; Wang, Xiao; Ye, Shaokai; Zhao, Pu; Lin, Xue

doi:10.1109/globalsip.2018.8646578

Cited by 18 publications

(10 citation statements)

References 7 publications


“…Their conclusion is not general since their experiments are conducted with a simple three-layer CNN on MNIST dataset [37]. Wang et al [38] proposed a robustness enhancement method combined with model pruning and logits augmentation. Guo et al [32] revealed the relationship between sparsity and robustness.…”

Section: Methods, A. Sparsity and Robustness (mentioning)

confidence: 98%

You Can’t Fool All the Models: Detect Adversarial Samples via Pruning Models

Renxuan, Chen, Dong et al. 2021. IEEE Access


Many adversarial attack methods have investigated the security issue of deep learning models. Previous works on detecting adversarial samples show superior accuracy but consume too much memory and computing resources. In this paper, we propose an adversarial sample detection method based on pruned models and evaluate four different pruning methods. We find that pruned neural network models are sensitive to adversarial samples, i.e., the pruned models tend to output labels different from the original model when given adversarial samples. Moreover, the pruned model has an extremely small model size and computational cost. Based on the detection result, we further propose a simple but effective defense approach to identify the true label of the adversarial sample. Experiments show that, on average, four different pruning methods outperform the SOTA multi-model based detection method (64.15% and 73.70%) by 28.65% and 18.73% on CIFAR10 and SVHN, respectively, with significantly fewer models used. The FLOPs of our structured pruned model are only 49.41% and 25.62% of the original model. Our defense approach achieves 68.60% and 72.03% average classification accuracy on CIFAR10 and SVHN, exceeding other advanced defense methods.
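The detection idea in this abstract (pruned models tend to disagree with the original model on adversarial inputs) can be illustrated with a minimal sketch. This is not the authors' implementation: the function `is_adversarial`, the `min_disagree` threshold, and the toy threshold classifiers are all hypothetical stand-ins chosen for illustration.

```python
from typing import Callable, Sequence

# A classifier maps an input vector to an integer label (assumed interface).
Classifier = Callable[[Sequence[float]], int]


def is_adversarial(
    x: Sequence[float],
    original: Classifier,
    pruned_models: Sequence[Classifier],
    min_disagree: int = 1,  # hypothetical threshold, not taken from the paper
) -> bool:
    """Flag x when at least `min_disagree` pruned models disagree with the original."""
    base_label = original(x)
    disagreements = sum(1 for model in pruned_models if model(x) != base_label)
    return disagreements >= min_disagree


def make_threshold_classifier(threshold: float) -> Classifier:
    """Toy 1-D classifier: label 1 if the first feature exceeds the threshold."""
    return lambda x: int(x[0] > threshold)


# Toy stand-ins: pruning is imitated by slightly shifted decision boundaries.
original = make_threshold_classifier(0.50)
pruned = [make_threshold_classifier(0.45), make_threshold_classifier(0.55)]

clean = [0.90]          # far from every boundary, all models agree
near_boundary = [0.52]  # just past the original boundary, one pruned model disagrees

print(is_adversarial(clean, original, pruned))
print(is_adversarial(near_boundary, original, pruned))
```

In this toy setup the input far from the boundary is accepted and the one sitting just past the original model's boundary is flagged, which mirrors the intuition that adversarial samples lie close to decision boundaries that pruning shifts.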


“…In this section, we compare our approach and results with recent work that have explored the effect of compression techniques on the robustness of DNNs against adversarial attacks (Wang et al 2018; Dhillon et al 2018; Guo et al 2018; Lin, Gan, and Han 2019; Zhao et al 2018). These works have focused on robustness against input-specific adversarial examples, whereas, to our best knowledge, the analysis of robustness to UAPs proposed in our paper is novel.…”

Section: Related Work (mentioning)

confidence: 96%

Robustness and Transferability of Universal Attacks on Compressed Models

Matachana, Co, Muñoz-González et al. 2020. Preprint


Neural network compression methods like pruning and quantization are very effective at efficiently deploying Deep Neural Networks (DNNs) on edge devices. However, DNNs remain vulnerable to adversarial examples, inconspicuous inputs that are specifically designed to fool these models. In particular, Universal Adversarial Perturbations (UAPs) are a powerful class of adversarial attacks which create adversarial perturbations that can generalize across a large set of inputs. In this work, we analyze the effect of various compression techniques on UAP attacks, including different forms of pruning and quantization. We test the robustness of compressed models to white-box and transfer attacks, comparing them with their uncompressed counterparts on CIFAR-10 and SVHN datasets. Our evaluations reveal clear differences between pruning methods, including Soft Filter and Post-training Pruning. We observe that UAP transfer attacks between pruned and full models are limited, suggesting that the systemic vulnerabilities across these models are different. This finding has practical implications as using different compression techniques can blunt the effectiveness of black-box transfer attacks. We show that, in some scenarios, quantization can produce gradient-masking, giving a false sense of security. Finally, our results suggest that conclusions about the robustness of compressed models to UAP attacks are application dependent, observing different phenomena in the two datasets used in our experiments.
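What distinguishes a UAP from an input-specific attack is that one fixed perturbation is added to every input. A common way to summarize its effect is the fraction of inputs whose predicted label changes. The sketch below shows that measurement only; `fooling_rate`, the toy `sign_classifier`, and the hand-picked perturbation are hypothetical illustrations, not the evaluation code or attack used in the paper.

```python
from typing import Callable, List, Sequence


def fooling_rate(
    model: Callable[[Sequence[float]], int],
    inputs: List[List[float]],
    perturbation: Sequence[float],
) -> float:
    """Fraction of inputs whose predicted label changes under one shared perturbation."""
    fooled = 0
    for x in inputs:
        # The same perturbation vector is added to every input (the "universal" part).
        x_adv = [xi + di for xi, di in zip(x, perturbation)]
        if model(x_adv) != model(x):
            fooled += 1
    return fooled / len(inputs)


def sign_classifier(x: Sequence[float]) -> int:
    """Toy model: label 1 if the features sum to a positive value, else 0."""
    return int(sum(x) > 0)


inputs = [[0.2, 0.1], [1.0, 1.0], [-0.5, 0.1], [0.1, 0.05]]
uap = [-0.2, -0.2]  # hand-picked for the toy model, not produced by a real attack

print(fooling_rate(sign_classifier, inputs, uap))  # 0.5
```

Comparing this rate for the same perturbation on a full model and on its compressed counterpart is one simple way to probe how well the perturbation transfers between them.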


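“…Reparameterization, STR) [106] and a structured pruning strategy that requires retraining and uses artificial-bee-colony automated structure search (Artificial Bee Colony Algorithm, ABC) [107]. At different pruning rates, ResNet50 [34] on ImageNet [108] … [109] ResNet50 at different bit widths on CIFAR10 [110] … [27] evaluated with the original method, in which a small student network learns both the image labels and the outputs of a larger, high-accuracy teacher network, thereby transferring knowledge from the large model to the small one. Here we use ResNet50, VGG16 [33], and WRN-40-2 [111] as teacher models and ShuffleNetV1 [112] … [113,114,115] all analyzed the adversarial robustness of sparsified neural networks and found that when the pruning rate is relatively low (e.g., 5% to 10%), the network's robustness against adversarial attacks improves. However, most of these methods were tested on small-scale datasets such as MNIST and CIFAR10; on large-scale ImageNet the situation is somewhat more severe. Moderate sparsification can indeed improve a network's robustness, as L1 and L2 regularization [54] also demonstrate. But most current pruning methods aim for pruning rates above 80%, and the excessive sparsification caused by such high pruning rates in turn seriously degrades the network's robustness. Sehwag et al. [37] … Ye et al. [36] added an adversarial training objective to the retraining stage of pruning and successfully improved the adversarial robustness of the pruned model. Sehwag et al. [38] … Other studies [35,117,118,119,120] proposed solutions for maintaining adversarial robustness under weight quantization. Lin et al. [120] added a weight regularization term during training so that the resulting quantized model has better adversarial robustness. Shkolnik et al. [35] found that uniform quantization resists noise fluctuations better than normal quantization. Most current methods add some optimization strategy during training to maintain the model's adversarial robustness. For quantization methods that require training, more effective adversarial training is therefore the main direction for future exploration. At the same time, how to maintain the adversarial robustness of quantized models that do not require retraining still deserves further study. On the other hand, from the results in Section 4 we can also conclude that adversarial robustness and noise robustness are independent of each other, which is consistent with the conclusion of Alfred et al. [135]. When a neural network's gradients are exposed, targeted perturbations can often be crafted against the model's weak points. … [122], BERT [123], and other large-scale models. It is currently a commonly used model compression approach for applying large models to downstream tasks. There is currently no systematic analysis of adversarial robustness and noise robustness for knowledge distillation methods; the experiments in Table 4 of this paper provide a relatively complete robustness analysis of student models. Compared with pruning and quantization, knowledge distillation yields no particularly uniform conclusion: its robustness depends not only on model capacity but possibly also on the structure of the student model. Its adversarial robustness and noise robustness, however, behave fairly consistently. More robust network architectures may therefore be a direction worth exploring for robust knowledge-distillation compression algorithms. On the other hand, because knowledge distillation requires training the student model, the student can be adjusted for different tasks during the training stage. Dabouei et al. [91] studied the effect of data augmentation on knowledge distillation and showed that some data augmentation methods can transfer additional knowledge. Goldblum et al. [121] further added adversarial examples when training the student model, thereby improving the student's robustness. More strategies that improve model robustness can therefore be introduced into student training to improve existing knowledge distillation frameworks.…”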

Section: Robustness Analysis of Model Compression Algorithms (unclassified)

Robustness analysis for compact neural networks

Chen, Peng 2022. Sci. Sin.-Tech.
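The excerpt above describes knowledge distillation as a small student network learning from both the image labels and the outputs of a larger teacher network. A minimal sketch of such a combined loss follows. It uses the commonly cited temperature-softened formulation; the weighting `alpha`, the `temperature` value, and the function names are assumptions for illustration, and the excerpt does not state which exact loss or hyperparameters the cited work used.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    student_logits: Sequence[float],
    teacher_logits: Sequence[float],
    label: int,
    alpha: float = 0.5,        # assumed weight between label loss and teacher loss
    temperature: float = 4.0,  # assumed softening temperature
) -> float:
    """Weighted sum of cross-entropy on the label and KL divergence to the teacher."""
    # Hard-label term: ordinary cross-entropy at temperature 1.
    hard = -math.log(softmax(student_logits)[label])
    # Soft-label term: KL(teacher || student) on temperature-softened outputs.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
    # The temperature**2 factor keeps the soft term's gradient scale comparable.
    return alpha * hard + (1 - alpha) * temperature ** 2 * soft


student = [2.0, 0.5, -1.0]
teacher_agrees = [2.0, 0.5, -1.0]
teacher_differs = [-1.0, 0.5, 2.0]

print(distillation_loss(student, teacher_agrees, label=0))
print(distillation_loss(student, teacher_differs, label=0))
```

When the student's logits match the teacher's, the soft term vanishes and only the label term remains, so the second call yields a larger loss than the first.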

