2022
DOI: 10.48550/arxiv.2203.05492
Preprint
An Empirical Study of Low Precision Quantization for TinyML

Abstract: Tiny machine learning (tinyML) has emerged during the past few years aiming to deploy machine learning models to embedded AI processors with highly constrained memory and computation capacity. Low precision quantization is an important model compression technique that can greatly reduce both memory consumption and computation cost of model inference. In this study, we focus on post-training quantization (PTQ) algorithms that quantize a model to low-bit (less than 8-bit) precision with only a small set of calib…
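The abstract's central technique, post-training quantization to low-bit precision, can be illustrated with a minimal sketch. This assumes symmetric per-tensor quantization with the scale taken from the tensor's maximum absolute value (a stand-in for the calibration step the abstract mentions); the function names are illustrative, not from the paper.

```python
import numpy as np

def quantize_symmetric(w, bits=4):
    """Symmetric per-tensor post-training quantization.

    Maps float weights to signed integers in [-(2^(bits-1)-1), 2^(bits-1)-1]
    using a single scale derived from the tensor's max absolute value.
    """
    qmax = 2 ** (bits - 1) - 1
    scale = float(np.abs(w).max()) / qmax
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximate float tensor from integers and the scale."""
    return q.astype(np.float32) * scale

# Example: 4-bit quantization of a small weight tensor.
w = np.array([-0.7, -0.1, 0.0, 0.25, 0.7], dtype=np.float32)
q, scale = quantize_symmetric(w, bits=4)
w_hat = dequantize(q, scale)
```

At 4 bits the integer range is only [-7, 7], so each weight costs half a byte instead of four bytes, which is the memory and compute saving the abstract refers to; the trade-off is a per-element rounding error bounded by half the scale.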

Cited by 1 publication (1 citation statement)
References 11 publications
“…From the proposed taxonomy in Figure 5, the area of model optimization, based on referenced research contributions, is the one that has received the most extensive exploration. Indeed, within TinyML, we come across several cutting-edge works that explore techniques such as pruning [63], quantization [75], and knowledge distillation [88]. On the contrary, the scarcity of research focusing on HPO [93] can be attributed to its complexity, the lack of awareness about the importance of HPO and its potential to enhance the performance of ML models significantly, and the resource requirements, which can be a limiting factor for researchers with restricted access to high-performance computing infrastructure.…”
Section: Discussion
confidence: 99%