Abstract: Multipliers are the most space and power-hungry arithmetic operators of the digital implementation of deep neural networks. We train a set of state-of-the-art neural networks (Maxout networks) on three benchmark datasets: MNIST, CIFAR-10 and SVHN. They are trained with three distinct formats: floating point, fixed point and dynamic fixed point. For each of those datasets and for each of those formats, we assess the impact of the precision of the multiplications on the final error after training. We find that very low precision is sufficient not just for running trained networks but also for training them; for example, it is possible to train Maxout networks with 10-bit multiplications.
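As a rough illustration of the formats compared in this abstract, the sketch below (plain NumPy, with made-up bit widths rather than the paper's exact settings) rounds tensors to a fixed-point grid and, for the dynamic variant, picks the shared radix point from the tensor's current range before multiplying:

import numpy as np

def to_fixed_point(x, total_bits=16, frac_bits=8):
    # Round x to a signed fixed-point grid with `frac_bits` fractional bits
    # and saturate to the representable range (illustrative, not the paper's code).
    scale = 2.0 ** frac_bits
    qmax = 2 ** (total_bits - 1) - 1
    q = np.clip(np.round(x * scale), -qmax - 1, qmax)
    return q / scale

def to_dynamic_fixed_point(x, total_bits=10):
    # Dynamic fixed point: the shared radix point is chosen per tensor from its
    # current magnitude, so one small bit budget can cover different value ranges.
    max_abs = np.max(np.abs(x)) + 1e-12
    frac_bits = total_bits - 1 - int(np.ceil(np.log2(max_abs)))
    return to_fixed_point(x, total_bits, frac_bits)

w = np.random.randn(4, 4).astype(np.float32)
a = np.random.randn(4, 4).astype(np.float32)
# Low-precision multiplication emulated by quantizing both operands first.
y = to_dynamic_fixed_point(w) @ to_dynamic_fixed_point(a)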
“…However, postquantization yields performance loss, which is more prominent as the precision lowers. In particular, posttraining binarization (1-bit precision) enables the highest model compression and computational speedup but impacts heavily on a classifier's accuracy [27].…”
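A minimal sketch of what post-training binarization of a weight tensor can look like, assuming a sign-plus-scale scheme in the style of XNOR-Net rather than the exact method of the cited work:

import numpy as np

def binarize_weights(w):
    # Replace each weight tensor by sign(w) times one scaling factor
    # alpha = mean(|w|); this is where the post-quantization error comes from.
    alpha = np.mean(np.abs(w))
    return alpha * np.sign(w), alpha

w = np.random.randn(64, 128).astype(np.float32)
w_bin, alpha = binarize_weights(w)
# The binary tensor can be packed 1 bit per weight for storage; here we only
# measure the reconstruction error behind the accuracy drop mentioned above.
print("mean abs error:", np.mean(np.abs(w - w_bin)))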
Visual place recognition (VPR) is a robot's ability to determine whether a place was visited before using visual data. While conventional handcrafted methods for VPR fail under extreme environmental appearance changes, those based on convolutional neural networks (CNNs) achieve state-of-the-art performance but result in heavy runtime processes and model sizes that demand a large amount of memory. Hence, CNN-based approaches are unsuitable for resource-constrained platforms, such as small robots and drones. In this article, we take a multistep approach of decreasing the precision of model parameters, combining it with network depth reduction and fewer neurons in the classifier stage to propose a new class of highly compact models that drastically reduces the memory requirements and computational effort while maintaining state-of-the-art VPR performance. To the best of our knowledge, this is the first attempt to propose binary neural networks for solving the VPR problem effectively under changing conditions and with significantly reduced resource requirements. Our best-performing binary neural network, dubbed FloppyNet, achieves comparable VPR performance when considered against its full-precision and deeper counterparts while consuming 99% less memory and increasing the inference speed by seven times.
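The headline memory figure can be sanity-checked with back-of-the-envelope arithmetic; the parameter counts below are placeholders, not FloppyNet's actual sizes:

# Binarization alone gives a 32x (~97%) saving over 32-bit floats; combining it
# with a shallower, narrower network pushes the saving toward 99%.
float_params = 60_000_000          # e.g., an AlexNet-scale full-precision model
binary_params = 1_800_000          # a shallower, narrower binary model (assumed)
float_bytes = float_params * 4     # 32-bit floats
binary_bytes = binary_params / 8   # 1 bit per weight, packed
print(1 - binary_bytes / float_bytes)  # ~0.999, i.e. roughly 99% less memory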
“…Prior work has proposed schemes for uniform quantization (Courbariaux et al., 2014; Zhou et al., 2016) and nonuniform quantization (Han et al., 2015; Zhu et al., 2016). Uniform quantization uses integer or fixed-point format which can be accelerated with specialized math pipelines and is the focus of this paper.…”
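For reference, a per-tensor symmetric uniform quantizer of the kind the snippet refers to can be sketched as follows (illustrative NumPy, not the code of any cited work):

import numpy as np

def uniform_quantize(x, num_bits=8):
    # One scale factor for the whole tensor; values are mapped to signed integers.
    qmax = 2 ** (num_bits - 1) - 1
    scale = max(np.max(np.abs(x)) / qmax, 1e-12)
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

x = np.random.randn(256).astype(np.float32)
q, s = uniform_quantize(x)
x_hat = dequantize(q, s)   # low-precision approximation of x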
Quantization enables efficient acceleration of deep neural networks by reducing model memory footprint and exploiting low-cost integer math hardware units. Quantization maps floating-point weights and activations in a trained model to low-bitwidth integer values using scale factors. Excessive quantization, reducing precision too aggressively, results in accuracy degradation. When scale factors are shared at a coarse granularity across many dimensions of each tensor, the effective precision of individual elements within the tensor is limited. To reduce quantization-related accuracy loss, we propose using a separate scale factor for each small vector of (≈16-64) elements within a single dimension of a tensor. To achieve an efficient hardware implementation, the per-vector scale factors can be implemented with low-bitwidth integers when calibrated using a two-level quantization scheme. We find that per-vector scaling consistently achieves better inference accuracy at low precision compared to conventional scaling techniques for popular neural networks without requiring retraining. We also modify a deep learning accelerator hardware design to study the area and energy overheads of per-vector scaling support. Our evaluation demonstrates that per-vector scaled quantization with 4-bit weights and activations achieves 37% area savings and 24% energy savings while maintaining over 75% accuracy for ResNet50 on ImageNet. 4-bit weights and 8-bit activations achieve near-full-precision accuracy for both BERT-base and BERT-large on SQuAD while reducing area by 26% compared to an 8-bit baseline.
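A simplified sketch of the per-vector idea, assuming one floating-point scale per vector of 16 elements; the paper's two-level scheme additionally quantizes these scale factors to low-bitwidth integers, which is omitted here:

import numpy as np

def per_vector_quantize(w, vec_size=16, num_bits=4):
    # One scale factor per contiguous vector of `vec_size` elements along the
    # flattened last dimension, instead of one scale for the whole tensor.
    qmax = 2 ** (num_bits - 1) - 1
    v = w.reshape(-1, vec_size)
    scales = np.max(np.abs(v), axis=1, keepdims=True) / qmax
    scales = np.where(scales == 0, 1.0, scales)      # avoid division by zero
    q = np.clip(np.round(v / scales), -qmax, qmax)
    return (q * scales).reshape(w.shape)             # dequantized approximation

w = np.random.randn(64, 64).astype(np.float32)
w_pv = per_vector_quantize(w)                        # per-vector, 4-bit
print("per-vector MSE:", np.mean((w - w_pv) ** 2))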
“…To measure the robustness of VeriDL, we implement two types of server attack. The model compression attack compresses a trained DNN with small accuracy degradation [2,7]. To simulate the attack, we set up a fully-connected network with two hidden layers and the sigmoid activation function.…”
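The simulated victim model described in the snippet is a small fully-connected network; a minimal NumPy version with assumed layer widths might look like this:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Two hidden layers with sigmoid activations, mirroring the described setup;
# the layer widths (784-128-64-10) are assumptions, not taken from the paper.
rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((784, 128)) * 0.01, np.zeros(128)
W2, b2 = rng.standard_normal((128, 64)) * 0.01, np.zeros(64)
W3, b3 = rng.standard_normal((64, 10)) * 0.01, np.zeros(10)

def forward(x):
    h1 = sigmoid(x @ W1 + b1)
    h2 = sigmoid(h1 @ W2 + b2)
    return h2 @ W3 + b3  # logits

x = rng.standard_normal((1, 784))
print(forward(x).shape)  # (1, 10)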
Deep neural networks (DNNs) are prominent due to their superior performance in many fields. The deep-learning-as-a-service (DLaaS) paradigm enables individuals and organizations (clients) to outsource their DNN learning tasks to cloud-based platforms. However, the DLaaS server may return incorrect DNN models due to various reasons (e.g., Byzantine failures). This raises the serious concern of how to verify whether the DNN models trained by potentially untrusted DLaaS servers are indeed correct. To address this concern, in this paper, we design VeriDL, a framework that supports efficient correctness verification of DNN models in the DLaaS paradigm. The key idea of VeriDL is the design of a small-size cryptographic proof of the training process of the DNN model, which is associated with the model and returned to the client. Through the proof, VeriDL can verify the correctness of the DNN model returned by the DLaaS server with a deterministic guarantee and low overhead. Our experiments on four real-world datasets demonstrate the efficiency and effectiveness of VeriDL.