2021
DOI: 10.48550/arxiv.2109.09113
Preprint

HPTQ: Hardware-Friendly Post Training Quantization

Abstract: Neural network quantization enables the deployment of models on edge devices. An essential requirement for their hardware efficiency is that the quantizers are hardware-friendly: uniform, symmetric and with power-of-two thresholds. To the best of our knowledge, current post-training quantization methods do not support all of these constraints simultaneously. In this work we introduce a hardware-friendly post training quantization (HPTQ) framework, which addresses this problem by synergistically combining several…
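The constraints named in the abstract (uniform, symmetric quantization with power-of-two thresholds) can be illustrated with a short sketch. The code below is a minimal, self-contained example of such a quantizer written for this summary; the function names, the 8-bit default, and the ceiling-to-power-of-two threshold choice are illustrative assumptions, not HPTQ's actual algorithm.

```python
import numpy as np

def po2_threshold(x):
    """Smallest power-of-two threshold covering the tensor's dynamic range."""
    max_abs = float(np.max(np.abs(x)))
    return 2.0 ** np.ceil(np.log2(max_abs)) if max_abs > 0 else 1.0

def quantize_symmetric(x, n_bits=8):
    """Uniform, symmetric quantization with a power-of-two threshold.

    Returns the integer tensor and the scale needed to dequantize it.
    """
    t = po2_threshold(x)                 # power-of-two threshold, e.g. 1, 2, 4, ...
    scale = t / (2 ** (n_bits - 1))      # step size of the uniform grid
    q = np.clip(np.round(x / scale), -2 ** (n_bits - 1), 2 ** (n_bits - 1) - 1)
    return q.astype(np.int8), scale

def dequantize(q, scale):
    """Map integers back to the real-valued grid points."""
    return q.astype(np.float32) * scale

# Example: quantize a random weight tensor and measure the rounding error.
w = np.random.randn(64, 64).astype(np.float32)
q, s = quantize_symmetric(w)
print("threshold:", s * 2 ** 7, "max abs error:", np.max(np.abs(w - dequantize(q, s))))
```

Because the threshold is a power of two, the scale is also a power of two (up to the bit-width factor), which lets hardware implement the rescaling with shifts rather than general multiplications.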


Cited by 3 publications (2 citation statements)
References 32 publications
“…2) Post-Training Quantization: The post-training quantization (PTQ) [75], [78], [82], [90] is a conversion technique in which all trained weights and activations of the NN model are converted to some fixed point representation, following some quantization precision established after the training phase. As indicated in Fig.…”
Section: Quantization (mentioning)
confidence: 99%
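The quoted description of PTQ, converting a model's already-trained weights and activations to fixed-point at a precision chosen after training, can be sketched as follows. This is a hypothetical, framework-agnostic illustration: the `trained_weights` dictionary and the per-tensor int8 conversion are assumptions made for the example, not the procedure of any cited work.

```python
import numpy as np

def to_fixed_point(w, n_bits=8):
    """Per-tensor symmetric conversion of a trained float tensor to integers plus a scale."""
    max_abs = float(np.max(np.abs(w)))
    scale = max_abs / (2 ** (n_bits - 1) - 1) if max_abs > 0 else 1.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale

# Hypothetical trained model: layer name -> float32 weights (no retraining involved).
trained_weights = {
    "conv1": np.random.randn(3, 3, 3, 16).astype(np.float32),
    "fc":    np.random.randn(128, 10).astype(np.float32),
}

# Post-training quantization: every tensor is converted only after training finishes.
quantized = {name: to_fixed_point(w) for name, w in trained_weights.items()}
for name, (q, s) in quantized.items():
    print(f"{name}: int8 weights, scale={s:.5f}")
```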
“…However, it may become detrimental where quantization is a mandatory operation for final deployment. For example, many well-known architectures have quantization collapse issues like MobileNet (Howard et al 2017;Sandler et al 2018;Howard et al 2019) and EfficientNet (Tan and Le 2019), which calls for remedy designs or advanced quantization schemes like (Sheng et al 2018;Yun and Wong 2021) and (Bhalgat et al 2020;Habi et al 2021) respectively.…”
Section: Introduction (mentioning)
confidence: 99%