2023
DOI: 10.48550/arxiv.2303.03106
Preprint

Rotation Invariant Quantization for Model Compression

Abstract: Post-training Neural Network (NN) model compression is an attractive approach for deploying large, memory-consuming models on devices with limited memory resources. In this study, we investigate the rate-distortion tradeoff for NN model compression. First, we suggest a Rotation-Invariant Quantization (RIQ) technique that utilizes a single parameter to quantize the entire NN model, yielding a different rate at each layer, i.e., mixed-precision quantization. Then, we prove that our rotation-invariant approach is…
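The abstract describes a scheme in which a single global parameter determines the quantization of every layer while the resulting bit rate still varies per layer. Below is a minimal, hypothetical sketch of that idea: each layer's step size is driven by one shared parameter (here called gamma) together with a rotation-invariant statistic of the layer, its L2 norm. The function name, the step-size rule gamma * norm / sqrt(n), and the rate estimate are illustrative assumptions for exposition, not the paper's exact formulation.

import numpy as np

def riq_quantize_layer(weights, gamma):
    """Uniformly quantize one layer with a step size derived from a single
    global parameter `gamma` and the layer's L2 norm.

    The step depends on the weights only through their norm, which is
    unchanged by any rotation of the weight vector -- the sense in which
    such a scheme is rotation invariant. (Illustrative sketch only; the
    paper's exact step-size rule is not reproduced here.)
    """
    w = weights.ravel()
    n = w.size
    # Rotation-invariant statistic: the layer's L2 norm.
    norm = np.linalg.norm(w)
    # Hypothetical step-size rule: scale the per-coordinate RMS by gamma.
    step = gamma * norm / np.sqrt(n)
    q = np.round(w / step)                      # integer code per weight
    w_hat = (q * step).reshape(weights.shape)   # de-quantized weights
    # Effective rate (bits/weight) differs per layer because the number of
    # occupied quantization levels depends on each layer's statistics.
    levels = int(q.max() - q.min()) + 1
    rate = np.log2(max(levels, 1))
    return w_hat, rate

# Usage: one shared gamma, but layers with different weight statistics
# end up at different effective rates (mixed precision).
rng = np.random.default_rng(0)
layers = {"conv1": rng.normal(0.0, 0.05, (64, 3, 3, 3)),
          "fc": rng.laplace(0.0, 0.02, (1000, 512))}
for name, w in layers.items():
    w_hat, rate = riq_quantize_layer(w, gamma=0.05)
    print(name, f"rate ~ {rate:.2f} bits/weight")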

Cited by 0 publications.
References 37 publications (45 reference statements).