2021
DOI: 10.1007/978-3-030-80129-8_17

Neural Network Compression Framework for Fast Model Inference

Cited by 15 publications (3 citation statements)
References 9 publications
“…Optimization. To achieve faster inference and throughput via quantization and optimization, the library utilizes OpenVINO [5] and Neural Network Compression Framework (NNCF) [20].…”
Section: Deployment
confidence: 99%
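
For context, a minimal sketch of the post-training quantization flow that NNCF provides for OpenVINO deployment, assuming a recent NNCF release and a PyTorch model; model, calibration_loader, and transform_fn below are illustrative placeholders, not names taken from the cited papers.

import nncf
import torch

def transform_fn(data_item):
    # Pull the model input out of an (images, labels) batch.
    images, _ = data_item
    return images

# Wrap an ordinary DataLoader so NNCF can draw calibration samples from it.
calibration_dataset = nncf.Dataset(calibration_loader, transform_fn)

# Returns an 8-bit quantized model suitable for OpenVINO inference.
quantized_model = nncf.quantize(model, calibration_dataset)
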
“…convert them to integer weights) and results in a quantised CNN library in CheckINN. Quantisation is a common technique in machine learning and NN verification: quantised neural networks take less computational resources to run, are more amenable to verification, and often can be trained to be as accurate as floating point networks [11,25,26]. Modulo this hurdle, verification of the CNNs goes in a straightforward way, and requires just one line of code.…”
Section: Reachability and Symbolic Execution
confidence: 99%
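
As an illustration of the integer-weight conversion this statement describes, here is a generic symmetric int8 quantization scheme; this is a common textbook formulation, not CheckINN's actual procedure.

import numpy as np

def quantize_int8(w):
    # Map the largest weight magnitude to 127; keep one float scale.
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an approximation of the original float weights.
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(w)
print(np.abs(w - dequantize(q, scale)).max())  # small rounding error
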
“…In a paper from Kozlov et al. [4], the authors discuss the Neural Network Compression Framework (NNCF) for fast model inference. This framework includes quantization and pruning methods for model compression.…”
Section: Related Work
confidence: 99%
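
A minimal sketch of how quantization and filter pruning can be combined in NNCF's training-time, config-driven API, assuming a PyTorch model; exact configuration keys can differ between NNCF releases.

from nncf import NNCFConfig
from nncf.torch import create_compressed_model

nncf_config = NNCFConfig.from_dict({
    "input_info": {"sample_size": [1, 3, 224, 224]},  # expected input shape
    "compression": [
        {"algorithm": "quantization"},                # insert 8-bit fake-quantize ops
        {"algorithm": "filter_pruning",
         "params": {"pruning_target": 0.5}},          # target ~50% of filters pruned
    ],
})

# Wraps the model with compression hooks; fine-tune it as usual afterwards.
compression_ctrl, compressed_model = create_compressed_model(model, nncf_config)
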

Model Compression
Ishtiaq, Mahmood, Anees et al. (2021). Preprint.