Abstract: For most deep learning algorithms, training is notoriously time-consuming. Since most of the computation in training neural networks is typically spent on floating-point multiplications, we investigate an approach to training that eliminates the need for most of these. Our method consists of two parts: first, we stochastically binarize weights to convert the multiplications involved in computing hidden states into sign changes. Second, while back-propagating error derivatives, in addition to binarizing the weights, we…
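As a rough illustration of the stochastic binarization step described in this abstract, the NumPy sketch below samples binary weights with a hard-sigmoid probability; the exact probability function and clipping are illustrative assumptions, not necessarily the authors' precise scheme.

```python
import numpy as np

def stochastic_binarize(W, rng=None):
    """Sample binary weights in {-1, +1} from real-valued weights W.

    P(w_b = +1) follows a 'hard sigmoid' of the weight, a common choice
    in stochastic binarization schemes (illustrative assumption here).
    """
    if rng is None:
        rng = np.random.default_rng()
    p = np.clip((W + 1.0) / 2.0, 0.0, 1.0)            # probability of +1
    return np.where(rng.random(W.shape) < p, 1.0, -1.0)

# With binary weights, the matrix product in the forward pass reduces to
# additions and subtractions (sign changes) instead of float multiplies.
W = np.random.uniform(-1, 1, size=(4, 3))
x = np.random.randn(3)
h = stochastic_binarize(W) @ x
```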
“…The DNN baseline model usually adopts a 32-bit floating-point (float32) representation for weight values. To compress the bit representation of the data, various works [15], [16] have proposed DNN model quantization techniques, including fixed bit-length, ternary, and binary weight representations. The truncated bit-length representation reduces the DNN model size, the computation burden on the hardware platform, and memory bandwidth consumption.…”
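For the ternary case mentioned in the quote, a threshold-based quantizer might look like the following minimal sketch; the fixed threshold rule is an assumption in the spirit of ternary weight networks, not the specific method of [15] or [16].

```python
import numpy as np

def ternarize(W, t=0.05):
    """Map float32 weights to {-1, 0, +1} with a fixed threshold t.

    Values near zero are pruned to 0; the rest keep only their sign,
    so each weight can be stored in 2 bits instead of 32.
    """
    Wt = np.zeros_like(W)
    Wt[W > t] = 1.0
    Wt[W < -t] = -1.0
    return Wt
```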
Being able to learn from complex data with phase information is imperative for many signal processing applications. Today's real-valued deep neural networks (DNNs) have shown efficiency in latent information analysis but fall short when applied to the complex domain. Deep complex networks (DCNs), in contrast, can learn from complex data but have high computational costs; therefore, they cannot satisfy the instant decision-making requirements of many deployable systems dealing with short observations or short signal bursts. Recently, the Binarized Complex Neural Network (BCNN), which integrates DCNs with binarized neural networks (BNNs), has shown great potential for classifying complex data in real time. In this paper, we propose a structural-pruning-based accelerator for BCNN that provides more than 5000 frames/s inference throughput on edge devices. The high performance comes from both the algorithm and hardware sides. On the algorithm side, we apply structural pruning to the original BCNN models and achieve a 20× pruning rate with negligible accuracy loss; on the hardware side, we propose a novel 2D convolution accelerator for the binary complex neural network. Experimental results show that the proposed design runs at over 90% utilization and achieves inference throughputs of 5882 frames/s and 4938 frames/s for complex NIN-Net and ResNet-18, respectively, on the CIFAR-10 dataset with an Alveo U280 board.
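One way to see what a binary complex 2D convolution accelerator has to compute is that each complex multiply-accumulate decomposes into four real ones, (a + bi)(c + di) = (ac − bd) + (ad + bc)i. The sketch below shows that decomposition with sign-binarized weights; it illustrates the arithmetic only and is not the accelerator design from the paper.

```python
import numpy as np

def binary_complex_matmul(Wr, Wi, xr, xi):
    """Multiply a complex input (xr + i*xi) by sign-binarized complex
    weights (sign(Wr) + i*sign(Wi)).

    One complex product costs four real products; with binary weights
    those reduce to sign flips and additions.
    """
    Br, Bi = np.sign(Wr), np.sign(Wi)
    yr = Br @ xr - Bi @ xi   # real part
    yi = Br @ xi + Bi @ xr   # imaginary part
    return yr, yi
```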
“…Many weight reparameterization approaches have been proposed to address this challenge. One way is to use stochastic weights (Lin et al., 2015; Shayer et al., 2017), but these methods suffer from the slow computation of sampling. Another is to use a quantizer function (Rastegari et al., 2016; Li et al., 2016) to map or threshold continuous weights to discrete values.…”
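For the quantizer-function branch, a common instance in the spirit of Rastegari et al. (2016) is a sign function with a per-tensor scale, trained with a straight-through estimator for the gradient. The sketch below is one illustrative variant, not the exact formulation of the cited works.

```python
import numpy as np

def quantize_sign(W):
    """Deterministic binarizer: alpha * sign(W), with alpha = mean(|W|)
    so the binary weights approximate the scale of the real ones."""
    alpha = np.abs(W).mean()
    return alpha * np.sign(W)

def ste_grad(W, grad_wrt_quantized):
    """Straight-through estimator: pass the upstream gradient through
    unchanged where |W| <= 1, and zero it elsewhere."""
    return grad_wrt_quantized * (np.abs(W) <= 1.0)
```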
Shift neural networks reduce computational complexity by removing expensive multiplication operations and quantizing continuous weights into low-bit discrete values, making them fast and energy-efficient compared to conventional neural networks. However, existing shift networks are sensitive to weight initialization and also suffer degraded performance caused by the vanishing-gradient and weight-sign-freezing problems. To address these issues, we propose S³ low-bit re-parameterization, a novel technique for training low-bit shift networks. Our method decomposes a discrete parameter in a sign-sparse-shift 3-fold manner. In this way, it efficiently learns a low-bit network whose weight dynamics are similar to those of full-precision networks and which is insensitive to weight initialization. Our proposed training method pushes the boundaries of shift neural networks and shows that 3-bit shift networks outperform their full-precision counterparts in terms of top-1 accuracy on ImageNet.
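A rough reading of the 3-fold decomposition is that each discrete weight is the product of a sign factor, a sparsity (zero/non-zero) factor, and a power-of-two shift factor; the sketch below reconstructs such a weight from those factors. The variable names and the exact reconstruction rule are assumptions for illustration, not the paper's precise parameterization.

```python
import numpy as np

def reconstruct_shift_weight(sign_bit, sparse_bit, shift_exp):
    """Compose a low-bit 'shift' weight from three factors:
    sign in {-1, +1}, sparse in {0, 1}, shift_exp a small non-negative int.

    w = sign * sparse * 2**(-shift_exp), so multiplying an activation by w
    is just a sign flip, a mask, and a bit shift -- no float multiply.
    """
    return sign_bit * sparse_bit * np.exp2(-shift_exp)

# e.g. sign = -1, sparse = 1, shift_exp = 2  ->  w = -0.25
w = reconstruct_shift_weight(np.array(-1.0), np.array(1.0), np.array(2))
```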
“…A closely related research thread is network quantization [25], where, instead of binary, low-bitwidth networks and activations are considered. In [68, 38], the network gradients are also quantized, reducing the memory and computation footprint of the backward pass.…”
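A common ingredient of low-bitwidth gradient schemes is stochastic rounding of gradient values onto a small grid, which keeps the quantizer unbiased in expectation; the following sketch is an illustrative example and not the specific method of [68] or [38].

```python
import numpy as np

def stochastic_round_grad(g, bits=8, rng=None):
    """Quantize a gradient tensor to 2**bits levels over its own range,
    rounding up or down at random so the quantization is unbiased."""
    if rng is None:
        rng = np.random.default_rng()
    lo, hi = g.min(), g.max()
    if hi == lo:                      # constant gradient: nothing to quantize
        return g.copy()
    scale = (hi - lo) / (2**bits - 1)
    q = np.floor((g - lo) / scale + rng.random(g.shape))  # stochastic rounding
    return q * scale + lo
```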
In this work, we present a new algorithm for multi-domain learning. Given a pretrained architecture and a set of visual domains received sequentially, the goal of multi-domain learning is to produce a single model that performs a task in all the domains together. Recent works showed how this problem can be addressed by masking the internal weights of a given original conv-net through learned binary variables. In this work, we provide a general formulation of binary-mask-based models for multi-domain learning via affine transformations of the original network parameters. Our formulation achieves significantly higher levels of adaptation to new domains, reaching performance comparable to domain-specific models while requiring slightly more than 1 bit per network parameter per additional domain. Experiments on two popular benchmarks showcase the power of our approach, achieving performance close to state-of-the-art methods on the Visual Decathlon Challenge.
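The "affine transformation of the original network parameters" can be pictured as scaling and shifting the frozen pretrained weights with a learned per-domain binary mask; the sketch below shows one such parameterization. The exact form (the names k0, k1 and how the mask is learned) is an assumption for illustration, not the paper's precise formulation.

```python
import numpy as np

def domain_adapted_weights(W0, mask, k0, k1):
    """Build domain-specific weights from frozen pretrained weights W0.

    Each domain stores only a binary mask (about 1 bit per parameter) plus
    two scalars k0, k1; the adapted weight is an affine function of W0:
        W_d = W0 * (k0 + k1 * mask)
    """
    return W0 * (k0 + k1 * mask)

W0 = np.random.randn(4, 4)            # pretrained weights, shared across domains
mask = (np.random.rand(4, 4) > 0.5)   # learned per-domain binary mask
W_d = domain_adapted_weights(W0, mask.astype(W0.dtype), k0=0.1, k1=0.9)
```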