2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019
DOI: 10.1109/cvpr.2019.00748
Quantization Networks

Abstract: Although deep neural networks are highly effective, their high computational and memory costs severely challenge their applications on portable devices. As a consequence, low-bit quantization, which converts a full-precision neural network into a low-bitwidth integer version, has been an active and promising research topic. Existing methods formulate the low-bit quantization of networks as an approximation or optimization problem. Approximation-based methods confront the gradient mismatch problem, while optimi…

Cited by 234 publications (102 citation statements). References 18 publications.
“…As a step to enable faster inference, significant interest is shown in the design of custom ASIC NN accelerators [4], [5], [17] targeting both cloud platforms and mobile SoCs. In addition, quantization [18] is leveraged to further improve the inference efficiency. During quantization, both weights and activations are converted to lower-precision numerical representations (e.g., perform INT8 computations in place of FLOAT32).…”
Section: Impact of NCFET on Neural Network Inference Accelerators (mentioning)
confidence: 99%
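The statement above describes the core conversion during quantization: weights and activations move from FLOAT32 to a lower-precision representation such as INT8. A minimal sketch of symmetric per-tensor INT8 quantization, using NumPy (the function names and the toy weight values are illustrative, not from the cited work):

```python
import numpy as np

def quantize_int8(x):
    """Symmetric per-tensor quantization: FLOAT32 -> INT8 plus a scale factor."""
    max_abs = float(np.max(np.abs(x)))
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate FLOAT32 tensor from the INT8 representation."""
    return q.astype(np.float32) * scale

w = np.array([0.5, -1.27, 0.003, 1.0], dtype=np.float32)
q, s = quantize_int8(w)       # integer weights, one shared scale
w_hat = dequantize(q, s)      # approximate reconstruction of w
```

The hardware benefit comes from performing the matrix arithmetic directly on the INT8 values and applying the scale once at the end, which is what the ASIC accelerators referenced above exploit.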
“…Hereafter, when referring to the accuracy of a quantization size, we refer to this average value. As shown in Table 1, ResNet-101 and SqueezeNet are amenable to compression and their accuracy is slightly affected by quantization [7], [18]. On the other hand, [8]- [10], [12] are highly impacted by quantization and their accuracy degrades significantly as the quantization size decreases.…”
Section: Neural Network Inference Evaluation (mentioning)
confidence: 99%
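The accuracy trend described above — degradation growing as the quantization size (bit width) shrinks — can be illustrated directly by measuring reconstruction error at different bit widths. A small sketch, assuming NumPy and synthetic Gaussian weights (not data from the cited evaluation):

```python
import numpy as np

def quantize_uniform(x, bits):
    """Uniform symmetric quantization to `bits` bits, then dequantize."""
    levels = 2 ** (bits - 1) - 1          # e.g. 127 for INT8, 7 for 4-bit
    scale = float(np.max(np.abs(x))) / levels
    return np.clip(np.round(x / scale), -levels, levels) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=10_000).astype(np.float32)

# Mean squared reconstruction error at each bit width; error shrinks
# as more quantization levels become available.
errors = {b: float(np.mean((w - quantize_uniform(w, b)) ** 2))
          for b in (2, 4, 8)}
```

Whether a given network tolerates this error gracefully (as ResNet-101 and SqueezeNet reportedly do) depends on the architecture, which is the distinction the citation draws.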
“…To efficiently execute deep models on the proposed large‐scale visual computing platform, we introduce network quantization techniques to reduce the computation load [7].…”
Section: Large‐scale Visual Computing Platform (mentioning)
confidence: 99%
“…When the weights and activation function outputs are represented using just a single bit, the resulting network is called a binarized neural network (BNN) [26]. BNNs are a highly popular variant of a quantized DNN [10,40,56,57], as their computing time can be up to 58 times faster, and their memory footprint 32 times smaller, than that of traditional DNNs [45]. There are also network architectures in which some parts of the network are quantized, and others are not [45].…”
Section: Introduction (mentioning)
confidence: 99%
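The single-bit case described above is the extreme of quantization: each weight and activation becomes ±1, so a dot product reduces to XNOR followed by a population count, which is where the reported speedups come from. A minimal sketch of that equivalence in NumPy (the binarization convention, sign with 0 mapped to +1, is one common choice, not necessarily the one in [26]):

```python
import numpy as np

def binarize(x):
    """Binarize to {-1, +1} via the sign function, mapping 0 to +1."""
    return np.where(x >= 0, 1, -1).astype(np.int8)

a = binarize(np.array([0.3, -1.2, 0.7, -0.1]))
b = binarize(np.array([-0.5, -0.9, 0.2, 0.4]))

# Direct arithmetic dot product on the +/-1 vectors.
dot = int(a @ b)

# Equivalent bitwise form: with bits 1 for +1 and 0 for -1, the number of
# matching positions is the popcount of XNOR(a, b), and
#   dot = 2 * matches - n.
a_bits = (a > 0).astype(np.uint8)
b_bits = (b > 0).astype(np.uint8)
matches = int(np.sum(a_bits == b_bits))
assert dot == 2 * matches - len(a)
```

Packing the bit vectors into machine words and using hardware popcount instructions yields the large constant-factor speedups the citation reports for BNNs.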