Abstract: Deep Neural Networks (DNN) have achieved state-of-the-art results in a wide range of tasks, with the best results obtained with large training sets and large models. In the past, GPUs enabled these breakthroughs because of their greater computational speed. In the future, faster computation at both training and test time is likely to be crucial for further progress and for consumer applications on low-power devices. As a result, there is much interest in research and development of dedicated hardware for Deep …
“…Model compression has been extensively studied especially for image-classification tasks, see e.g., [38], [39], [40], [41], [42]. The typical model compression techniques include weight quantization/sparsification [38], [41], [42], network pruning [43], [44], KD [40], [45], and lightweight neural network architecture/operation design [46], [47], [48]. In this work, we mainly focus on model compression for vanilla GANs, i.e., noise-to-image task.…”
Generative Adversarial Networks (GANs) with high computation costs, e.g., BigGAN and StyleGAN2, have achieved remarkable results in synthesizing high-resolution, diverse images with high fidelity from random noise. Reducing the computation cost of GANs while maintaining photo-realistic image generation is an urgent and challenging task for their broad deployment on computationally resource-limited devices. In this work, we propose a novel yet simple Discriminator-Guided Learning approach for compressing vanilla GANs, dubbed DGL-GAN. Motivated by the observation that the teacher discriminator may contain meaningful information, we transfer knowledge solely from the teacher discriminator via the adversarial function. We show that DGL-GAN is valid since, empirically, learning from the teacher discriminator improves the performance of student GANs, as verified by extensive experiments. Furthermore, we propose a two-stage training strategy for DGL-GAN, which largely stabilizes the training process and achieves superior performance when we apply DGL-GAN to compress the two most representative large-scale vanilla GANs, i.e., StyleGAN2 and BigGAN. Experiments show that DGL-GAN achieves state-of-the-art (SOTA) results on both StyleGAN2 (FID 2.92 on FFHQ with nearly 1/3 of the parameters of StyleGAN2) and BigGAN (IS 93.29 and FID 9.92 on ImageNet with nearly 1/4 of the parameters of BigGAN) and also outperforms several existing vanilla GAN compression techniques. Moreover, DGL-GAN is also effective in boosting the performance of the original, uncompressed GANs: the original, uncompressed StyleGAN2 boosted with DGL-GAN achieves FID 2.65 on FFHQ, a new state-of-the-art result. Code and models are available at https://github.com/yuesongtian/DGL-GAN.
“…The concept of the Binary Neural Network (BNN) originated from the binary weight neural network (BWNN) [18], which quantizes only the weight representation to binary values. However, for FPGA devices with small on-chip memory, the intermediate activations of a BWNN are still too large to be stored in on-chip SRAM, and external memory is required.…”
Section: B. Binary Complex Neural Network
“…4) Binarization: There are two types of widely used binarization [18]: deterministic binarization and stochastic binarization. The equation for deterministic binarization is given in Eq.…”
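The two schemes described in [18] are the sign function (deterministic) and sampling ±1 with a "hard sigmoid" probability (stochastic). A minimal Python sketch of both, with function names of our own choosing, might look like:

```python
import random

def binarize_det(x):
    # Deterministic binarization: the sign function, mapping a
    # real-valued weight to +1 or -1 (with sign(0) taken as +1).
    return 1.0 if x >= 0 else -1.0

def hard_sigmoid(x):
    # Piecewise-linear "hard sigmoid" clip((x + 1) / 2, 0, 1),
    # used as the probability that x is binarized to +1.
    return max(0.0, min(1.0, (x + 1.0) / 2.0))

def binarize_stoch(x, rng=random):
    # Stochastic binarization: sample +1 with probability
    # hard_sigmoid(x), otherwise -1.
    return 1.0 if rng.random() < hard_sigmoid(x) else -1.0
```

For |x| ≥ 1 the hard sigmoid saturates at 0 or 1, so the stochastic rule degenerates into the deterministic one; the stochastic variant mainly matters for small-magnitude weights, where it acts as a regularizer.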
Section: B. Building Blocks and Operations
“…7) is non-differentiable at 0, so direct back-propagation is not feasible for weight-quantization training. The Straight-Through Estimator (STE) was proposed in the prior literature [18], [19] for back-propagation. The complex version of the STE is proposed in [22], and the equation can be found in Eq.…”
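The STE idea is simple: the forward pass binarizes the weight with the non-differentiable sign function, while the backward pass passes the gradient straight through to the underlying real-valued weight, typically cancelling it where the weight has saturated (|x| > 1). A scalar sketch, assuming the saturating variant (names are illustrative):

```python
def sign_forward(x):
    # Forward pass: non-differentiable sign binarization.
    return 1.0 if x >= 0 else -1.0

def ste_backward(grad_out, x, clip=1.0):
    # Backward pass: the Straight-Through Estimator treats sign()
    # as the identity inside the clip range, so the incoming
    # gradient flows through unchanged where |x| <= clip and is
    # zeroed outside (the saturating STE).
    return grad_out if abs(x) <= clip else 0.0
```

In a full training loop, `sign_forward` produces the binary weight used in the convolution, while `ste_backward` decides how much of the loss gradient updates the stored full-precision weight.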
Being able to learn from complex data with phase information is imperative for many signal processing applications. Today's real-valued deep neural networks (DNNs) have shown efficiency in latent information analysis but fall short when applied to the complex domain. Deep complex networks (DCNs), in contrast, can learn from complex data, but have high computational costs; therefore, they cannot satisfy the instant decision-making requirements of many deployable systems dealing with short observations or short signal bursts. Recently, the Binarized Complex Neural Network (BCNN), which integrates DCNs with binarized neural networks (BNNs), has shown great potential in classifying complex data in real time. In this paper, we propose a structural-pruning-based accelerator for BCNNs, which is able to provide more than 5000 frames/s of inference throughput on edge devices. The high performance comes from both the algorithm and hardware sides. On the algorithm side, we apply structural pruning to the original BCNN models and obtain a 20× pruning rate with negligible accuracy loss; on the hardware side, we propose a novel 2D convolution accelerator for the binary complex neural network. Experimental results show that the proposed design runs with over 90% utilization and is able to achieve inference throughput of 5882 frames/s and 4938 frames/s for complex NIN-Net and ResNet-18, respectively, on the CIFAR-10 dataset using an Alveo U280 board.
“…Quantization-aware training, which directly trains the network with lower precisions [6]. These approaches progressively enabled DNNs to first be quantized to 16-bit fixed point [7], 8-bit fixed point [8], and all the way down to binary precision [9]. The best precision of DNN parameters, however, varies across different NN models, and even across different layers within one model [5], [10].…”
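For concreteness, the fixed-point quantization this progression steps through (16-bit, 8-bit, down to binary) can be sketched as saturating round-to-nearest with a fixed number of fractional bits; the function name and saturation choice below are illustrative, not taken from [6]–[9]:

```python
def quantize_fixed_point(x, n_bits, frac_bits):
    # Round x to an n_bits-wide signed fixed-point value with
    # frac_bits fractional bits, saturating at the representable
    # integer range [qmin, qmax]; return the dequantized value.
    scale = 2 ** frac_bits
    qmin = -(2 ** (n_bits - 1))
    qmax = 2 ** (n_bits - 1) - 1
    q = max(qmin, min(qmax, round(x * scale)))
    return q / scale
```

With 8-bit storage and 6 fractional bits, 0.3 becomes 19/64 ≈ 0.297 and values beyond ±2 saturate; in quantization-aware training, such quantized values are used in the forward pass while gradients update the underlying full-precision weights. The per-layer choice of `n_bits` and `frac_bits` is exactly the precision that, as noted above, varies across models and layers.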
Reduced-precision and variable-precision multiply-accumulate (MAC) operations provide opportunities to significantly improve the energy efficiency and throughput of DNN accelerators with no or limited algorithmic performance loss, paving the way towards deploying AI applications on resource-constrained edge devices. Accordingly, various precision-scalable MAC array (PSMA) architectures were recently proposed. However, it is difficult to make a fair comparison between those alternatives, as each proposed PSMA is demonstrated in a different system with a different technology. This work aims to provide a clear view of the design space of PSMAs and offer insights for selecting the optimal architecture based on designers' needs. First, we introduce a precision-enhanced for-loop representation for DNN dataflows. Next, we use this new representation to build a comprehensive PSMA taxonomy, capable of systematically covering the most prominent state-of-the-art PSMAs as well as uncovering new PSMA architectures. Following that, we build a highly parameterized PSMA template that can be configured at design time into a huge subset of the design space spanned by the taxonomy. This allows us to fairly and thoroughly benchmark 72 different PSMA architectures. We perform such studies in 28nm technology, targeting run-time precision scalability from 8 to 2 bits and operating at 200 MHz and 1 GHz. Analyzing the resulting energy-efficiency and area breakdowns reveals key design guidelines for PSMA architectures.