2019
DOI: 10.48550/arxiv.1905.02161
Preprint

Batch Normalization is a Cause of Adversarial Vulnerability

Abstract: Batch normalization (batch norm) is often used in an attempt to stabilize and accelerate training in deep neural networks. In many cases it indeed decreases the number of parameter updates required to achieve low training error. However, it also reduces robustness to small adversarial input perturbations and noise by double-digit percentages, as we show on five standard datasets. Furthermore, substituting weight decay for batch norm is sufficient to nullify the relationship between adversarial vulnerability an…
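A minimal numerical illustration of the abstract's claim, under the standard batch-norm formula (the `batchnorm` helper and toy data below are hypothetical, not from the paper): normalizing by the batch standard deviation rescales a low-variance input direction by roughly 1/σ, so a small perturbation to one sample can be strongly amplified in the normalized output.

```python
import numpy as np

def batchnorm(x, eps=1e-5):
    """Standard batch normalization over the batch axis (no affine params)."""
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

rng = np.random.default_rng(0)
x = rng.normal(0.0, 0.01, size=(256, 1))  # feature with small std (~0.01)

x_pert = x.copy()
x_pert[0, 0] += 1e-3                      # tiny perturbation to one sample

# how much larger the change is in the normalized output than in the input
amplification = abs(batchnorm(x_pert)[0, 0] - batchnorm(x)[0, 0]) / 1e-3
print(amplification)                      # on the order of 1/std
```

Here the perturbation survives normalization nearly intact in absolute terms, but the feature itself has been compressed to unit scale, so relative to the normalized feature the perturbation is about 1/σ times larger.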

Cited by 19 publications (24 citation statements)
References 11 publications
“…We also define the DENSE set, since networks with many operations per cell and complex connectivity are underexplored in the literature despite their potential [27]. Next, we define the BN-FREE set that is of interest due to BN's potential negative side-effects [52,53] and the difficulty, or lack of need, of using it in some cases [54-56, 36, 37]. We finally add the RESNET/VIT set with two predefined image classification architectures: the commonly-used ResNet-50 [8] and a smaller 12-layer version of the Visual Transformer (ViT) [20] that has recently received a lot of attention in the vision community.…”
Section: DeepNets-1M
confidence: 99%
“…In [19], it is argued that ReLU-activated neural networks always have open decision boundaries, which leaves the risk of high responses for unseen OOD samples. Another paper argues that batch normalization is also a cause of adversarial vulnerability [6]. Such network vulnerabilities are hard to reflect in clean medical image benchmark datasets.…”
Section: Introduction
confidence: 99%
“…For AFF, following the settings in [27], we also use a 10-step ℓ∞ PGD attack with ε = 8/255 to generate adversarial perturbations during training, and train the entire network parameters f_θ and the linear classifier with the TRADES loss for 25 epochs, with an initial learning rate of 0.1 that is decayed by 0.1× at epochs 15 and 20. We report the AA, RA and SA for the best possible model for every method under every setting. TRIBN: Customized batch normalization. It has recently been shown in [27,62,63] that batch normalization (BN) could play a vital role in robust training with 'mixed' normal and adversarial data. Thus, a careful study of the BN strategy of ADVCL is needed, since two types of adversarial perturbations are generated in Eq.…”
confidence: 99%
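The attack setup quoted above (a multi-step ℓ∞ PGD with ε = 8/255) can be sketched as follows. The toy quadratic loss and `grad_fn` are illustrative stand-ins for a network's input gradient, not the cited method; a minimal sketch, assuming the usual random-start, gradient-sign formulation of PGD.

```python
import numpy as np

def pgd_linf(x, grad_fn, eps=8 / 255, steps=10, alpha=2 / 255, rng=None):
    """k-step l_inf PGD: ascend the loss along sign(grad), projecting each
    step back into the eps-ball around x and the valid pixel range [0, 1]."""
    rng = rng or np.random.default_rng(0)
    x_adv = np.clip(x + rng.uniform(-eps, eps, x.shape), 0.0, 1.0)  # random start
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(grad_fn(x_adv))  # gradient-sign ascent
        x_adv = np.clip(x_adv, x - eps, x + eps)         # project to eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)                 # keep valid pixel range
    return x_adv

# toy target: loss = 0.5 * ||x||^2, so the input gradient is simply x
x = np.full(4, 0.5)
x_adv = pgd_linf(x, grad_fn=lambda z: z, eps=8 / 255, steps=10)
```

With 10 steps of size 2/255 the iterate saturates the ε-ball, so each coordinate of `x_adv` ends up exactly at `x + 8/255` for this toy loss.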
“…Besides, we use the other BN for normally transformed data, i.e., (τ1(x), τ2(x), x_h). Compared with existing work [27,62,63] that used two BNs (one for adversarial data and the other for benign data), our proposed ADVCL calls for triple BNs (TRIBN).…”
confidence: 99%
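The dual/triple-BN idea discussed in these excerpts can be sketched as follows; a minimal training-mode-only sketch in numpy, where the `BatchNorm` and `MultiBN` names are hypothetical. The point is that each data type (benign, adversarial, ...) is routed through its own normalization layer, so adversarial batches do not contaminate the benign statistics.

```python
import numpy as np

class BatchNorm:
    """Minimal batch-norm layer (training mode only), per-feature statistics."""
    def __init__(self, dim, eps=1e-5):
        self.gamma = np.ones(dim)
        self.beta = np.zeros(dim)
        self.eps = eps

    def __call__(self, x):
        mu, var = x.mean(axis=0), x.var(axis=0)
        return self.gamma * (x - mu) / np.sqrt(var + self.eps) + self.beta

class MultiBN:
    """One BN per input branch, as in dual/triple-BN robust training:
    each data type keeps its own normalization statistics."""
    def __init__(self, dim, n_branches=3):
        self.bns = [BatchNorm(dim) for _ in range(n_branches)]

    def __call__(self, x, branch):
        return self.bns[branch](x)

rng = np.random.default_rng(0)
mbn = MultiBN(dim=8, n_branches=3)
clean = rng.normal(0.0, 1.0, (32, 8))
adv = clean + 5.0  # crude constant-shift stand-in for perturbed data
out_clean = mbn(clean, branch=0)   # benign batch uses BN 0
out_adv = mbn(adv, branch=1)       # adversarial batch uses BN 1
```

Because each branch normalizes with its own batch statistics, the shifted "adversarial" batch is whitened independently and leaves the benign branch's statistics untouched; with shared statistics, mixing the two batches would distort both.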