2015
DOI: 10.48550/arxiv.1510.03009
Preprint

Neural Networks with Few Multiplications

Abstract: For most deep learning algorithms, training is notoriously time-consuming. Since most of the computation in training neural networks is typically spent on floating point multiplications, we investigate an approach to training that eliminates the need for most of these. Our method consists of two parts: First, we stochastically binarize weights to convert multiplications involved in computing hidden states to sign changes. Second, while back-propagating error derivatives, in addition to binarizing the weights, we…
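To make the stochastic binarization idea in the abstract concrete, below is a minimal NumPy sketch. The hard-sigmoid probability and the clipping to [-1, 1] follow the common BinaryConnect-style formulation and are assumptions for illustration, not the authors' exact procedure.

```python
import numpy as np

def hard_sigmoid(w):
    """Clip (w + 1) / 2 into [0, 1]; used as the probability of sampling +1."""
    return np.clip((w + 1.0) / 2.0, 0.0, 1.0)

def stochastic_binarize(weights, rng=None):
    """Sample binary weights in {-1, +1}.

    Each weight becomes +1 with probability hard_sigmoid(w) and -1 otherwise,
    so the sampled weight equals the clipped real-valued weight in expectation.
    With +/-1 weights, every multiplication in the forward pass reduces to a
    sign change (an add or a subtract).
    """
    rng = np.random.default_rng() if rng is None else rng
    prob = hard_sigmoid(weights)
    return np.where(rng.random(weights.shape) < prob, 1.0, -1.0)

# Tiny usage example: one hidden layer computed with binarized weights.
W = np.array([[0.3, -0.8], [0.5, 0.1]])   # real-valued weights kept for updates
x = np.array([0.7, -0.2])                  # input activations
h = stochastic_binarize(W) @ x             # only sign flips and additions needed
```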

Cited by 70 publications (59 citation statements)
References 12 publications
“…The DNN baseline model usually adopts float32 bits representation for the weight value. In order to compress the bit representation of the data, various works [15], [16] have proposed different DNN model quantization techniques, including fixed bit-length, ternary, and binary weight representations. The truncated length bit representation reduces DNN model size, computation burden on the hardware platform, and memory bandwidth consumption.…”
Section: A. Model Compression (mentioning)
confidence: 99%
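As a rough illustration of the quantization families the quoted passage lists (fixed bit-length, ternary, and binary weight representations), here is a generic NumPy sketch; the min-max scaling and the fixed ternary threshold are illustrative assumptions, not the schemes used in the cited works.

```python
import numpy as np

def quantize_fixed_point(weights, bits=8):
    """Uniform signed fixed-point quantization to `bits` bits (min-max scaling)."""
    qmax = 2 ** (bits - 1) - 1                       # e.g. 127 for 8 bits
    max_abs = float(np.max(np.abs(weights)))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = np.round(weights / scale).astype(np.int32)
    return codes, scale                              # dequantize with codes * scale

def quantize_ternary(weights, threshold=0.05):
    """Map weights to {-1, 0, +1}: zero below the threshold, sign otherwise."""
    return np.where(np.abs(weights) < threshold, 0.0, np.sign(weights))

def quantize_binary(weights):
    """Map weights to {-1, +1} by thresholding at zero."""
    return np.where(weights >= 0, 1.0, -1.0)
```

Any of these truncated representations shrinks the model and replaces wide floating point multiplies with cheaper operations, which is the compression benefit the quoted passage describes.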
“…Many weight reparameterization approaches have been proposed to solve the challenge. One way is using stochastic weights (Lin et al., 2015; Shayer et al., 2017), but these methods suffer from the slow computation of sampling. Another way is utilizing a quantizer function (Rastegari et al., 2016; Li et al., 2016) to map or threshold continuous weights to discrete values.…”
Section: Related Work (mentioning)
confidence: 99%
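The quantizer functions mentioned in this quote are non-differentiable, so training through them usually relies on a straight-through estimator: quantize in the forward pass, pass gradients to the real-valued weights in the backward pass. The sketch below is a generic illustration of that pattern, not the specific method of the cited papers.

```python
import numpy as np

def ste_binarized_linear_grad(x, w_real, grad_out):
    """Backward pass through a binarized linear layer y = sign(W) @ x.

    Forward uses sign(w_real); backward treats the quantizer as the identity
    (straight-through), so gradients flow to the real-valued weights, which
    are typically clipped to [-1, 1] after each update.
    """
    w_bin = np.where(w_real >= 0, 1.0, -1.0)     # forward-pass weights
    grad_w_bin = np.outer(grad_out, x)           # dL/dW_bin for y = W_bin @ x
    grad_w_real = grad_w_bin                     # straight-through: pass gradient as-is
    grad_x = w_bin.T @ grad_out                  # gradient w.r.t. the input
    return grad_w_real, grad_x
```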
“…A closely related research thread is network quantization [25], where, instead of binary, low-bitwidth networks and activations are considered. In [68, 38], the network gradients are also quantized, reducing the memory and computation footprint of the backward pass.…”
Section: Related Work (mentioning)
confidence: 99%
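The last quoted statement notes that [68, 38] also quantize gradients in the backward pass. A generic way to do this is stochastic rounding onto a low-bitwidth fixed-point grid, sketched below; the scheme is an assumption for illustration, not the exact method of those works.

```python
import numpy as np

def stochastic_round_fixed_point(grad, bits=8, rng=None):
    """Stochastically round a gradient tensor to a signed `bits`-bit fixed-point grid.

    Rounding up with probability equal to the fractional part keeps the
    quantized gradient unbiased in expectation, which is why this style of
    rounding is popular for low-bitwidth backward passes.
    """
    rng = np.random.default_rng() if rng is None else rng
    qmax = 2 ** (bits - 1) - 1
    max_abs = float(np.max(np.abs(grad)))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    scaled = grad / scale
    low = np.floor(scaled)
    frac = scaled - low
    codes = low + (rng.random(grad.shape) < frac)    # round up with prob = frac
    return codes * scale
```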