2021
DOI: 10.48550/arxiv.2111.03759
Preprint

MQBench: Towards Reproducible and Deployable Model Quantization Benchmark

Abstract: Model quantization has emerged as an indispensable technique to accelerate deep learning inference. While researchers continue to push the frontier of quantization algorithms, existing quantization work is often unreproducible and undeployable. This is because researchers do not choose consistent training pipelines and ignore the requirements for hardware deployments. In this work, we propose Model Quantization Benchmark (MQBench), a first attempt to evaluate, analyze, and benchmark the reproducibility and dep…

Cited by 1 publication (1 citation statement)
References 47 publications
“…However, high precision computation requires large computing logic resources and more cycles for hardware implementation. To reduce computation and memory resources, an increasing number of designs based on model quantization have been proposed which quantize the model to 8 bits or even lower to 4 bits [ 35 , 36 , 37 ]. For example, ref.…”
Section: Preliminary and Motivation
confidence: 99%
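The quantization the statement above refers to (mapping floating-point weights to 8-bit or even 4-bit integer codes) can be sketched as a minimal uniform affine quantizer. This is an illustrative sketch only; the function names and parameters below are assumptions, not the MQBench API or the cited design's implementation.

```python
import numpy as np

def quantize(x, num_bits=8):
    """Uniform affine quantization of a float array to num_bits unsigned codes.

    Returns the integer codes plus the (scale, zero_point) needed to
    dequantize. Illustrative sketch; not taken from MQBench.
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    x_min, x_max = float(x.min()), float(x.max())
    scale = (x_max - x_min) / (qmax - qmin) or 1.0  # avoid zero scale
    zero_point = int(round(qmin - x_min / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int32)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map integer codes back to approximate floating-point values."""
    return (q.astype(np.float32) - zero_point) * scale

w = np.array([-1.0, -0.5, 0.0, 0.5, 1.0], dtype=np.float32)
q8, s8, z8 = quantize(w, num_bits=8)
q4, s4, z4 = quantize(w, num_bits=4)

# 4-bit codes span only 16 levels, so reconstruction error grows as bits shrink
err8 = float(np.abs(dequantize(q8, s8, z8) - w).max())
err4 = float(np.abs(dequantize(q4, s4, z4) - w).max())
```

The trade-off the citing paper exploits is visible here: halving the bit width shrinks storage and arithmetic cost but enlarges the worst-case reconstruction error, which is why 4-bit deployment typically needs the more careful calibration or training pipelines that MQBench benchmarks.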