2018
DOI: 10.48550/arxiv.1812.11800
Preprint

Regularized Binary Network Training

Abstract: The deployment of deep neural networks (DNNs) on edge devices has been difficult because they are resource-hungry. Binary neural networks (BNNs), in which both activations and weights are limited to 1 bit, help to alleviate the prohibitive resource requirements of DNNs. There is, however, a significant performance gap between BNNs and floating-point DNNs. To reduce this gap, we propose an improved binary training method, introducing a new regularization function that encourages training weights toward binary values.…
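As a rough, hedged illustration of the idea described in the abstract (not necessarily the authors' exact formulation), the sketch below adds a penalty to the training loss that is minimized when each weight sits at ±alpha. The function name, the L1-style form of the penalty, and the regularization strength `lam` are assumptions made for illustration only.

```python
import torch

def binary_regularizer(weights, alpha=1.0):
    # Zero when every weight equals +alpha or -alpha; grows as weights
    # drift away from those binary values. The exact form used in the
    # paper may differ; this L1-style variant is an assumption.
    return torch.mean(torch.abs(alpha - torch.abs(weights)))

# Hypothetical usage inside a training step:
w = torch.randn(256, 128, requires_grad=True)
task_loss = torch.tensor(0.0)   # stand-in for the usual cross-entropy term
lam = 1e-4                      # assumed regularization strength
loss = task_loss + lam * binary_regularizer(w)
loss.backward()                 # gradients now pull weights toward +/- alpha
```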

Cited by 22 publications (51 citation statements)
References 6 publications
“…But our proposed RA-BNN outperforms both 8-bit and binary baseline, as even after 5000 bit flips the accuracy only degrades to 37.1 % on ImageNet. [22]: 53.0 / 72.6; RBNN [19]: 59.9 / 81.9; Bi-Real [21]: 56.4 / 79.5; RA-BNN: 62.9 / 84.1…”
Section: Robustness Evaluation
confidence: 99%
“…However, for a binary weight neural network, with a sufficiently large amount of attack iterations, the attacker can still successfully degrade its accuracy to as low as random guess [16]. More importantly, due to aggressively compressing the floating-point weights (i.e., 32 bits or more) into binary (1 bit), BNN inevitably sacrifices its clean model accuracy by 10-30 %, which is widely discussed in prior works [19][20][21][22][23]. Therefore, in this work, for the first time, our primary goal is to construct a robust and accurate binary neural network (with both binary weight and activation) to simultaneously defend bit-flip attacks and improve clean model accuracy.…”
Section: Introduction
confidence: 99%
“…Obviously, the memory usage for weights and activations … [Figure 1: From top to bottom: original functions in spatial domain, corresponding functions in frequency domain, and the difference between the current function and the sign function in frequency domain. From left to right: sign function, combination of sine functions, tanh function in [8] and SignSwish function in [5] (short as SS).]…”
Section: Introduction
confidence: 99%
“…For example, DSQ [8] introduced a tanh-alike differentiable asymptotic function to estimate the forward and backward procedures of the conventional sign function. BNN+ [5] used a SignSwish activation function to modify the back-propagation of the original sign function and further introduced a regularization that encourages the weights around binary values. RBNN [22] proposed a training-aware approximation function to replace the sign function when computing the gradient.…”
Section: Introduction
confidence: 99%
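The statement above surveys smooth stand-ins for the sign function used during back-propagation (DSQ's tanh-like function, BNN+'s SignSwish, RBNN's training-aware approximation). As a hedged, generic sketch of this family of techniques, and not the specific surrogate from any of those papers, the snippet below keeps a hard sign in the forward pass and uses the gradient of tanh(beta * x) in the backward pass; the class name and the `beta` sharpness value are assumptions.

```python
import torch

class SignWithTanhGrad(torch.autograd.Function):
    # Forward: hard binarization with sign(x).
    # Backward: gradient of tanh(beta * x), a smooth surrogate, so that
    # non-zero gradients reach the latent full-precision weights/activations.
    beta = 2.0  # assumed sharpness; each cited method uses its own surrogate

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        beta = SignWithTanhGrad.beta
        surrogate = beta * (1.0 - torch.tanh(beta * x) ** 2)  # d/dx tanh(beta*x)
        return grad_output * surrogate

# Hypothetical usage: binary outputs in the forward pass, smooth gradients backward.
x = torch.randn(4, requires_grad=True)
y = SignWithTanhGrad.apply(x)
y.sum().backward()
print(x.grad)  # non-zero gradients, unlike the true derivative of sign
```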