2021
DOI: 10.1007/978-3-030-80129-8_17

Neural Network Compression Framework for Fast Model Inference

Cited by 15 publications (3 citation statements)
References 9 publications
“…Optimization. To achieve faster inference and throughput via quantization and optimization, the library utilizes OpenVINO [5] and Neural Network Compression Framework (NNCF) [20].…”
Section: Deployment
confidence: 99%
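
For context, a minimal sketch of the post-training quantization flow that NNCF provides for OpenVINO deployment, assuming a recent NNCF release and a PyTorch model; model, calibration_loader, and transform_fn below are illustrative placeholders, not names taken from the cited papers.

import nncf
import torch

def transform_fn(data_item):
    # Pull the model input out of an (images, labels) batch.
    images, _ = data_item
    return images

# Wrap an ordinary DataLoader so NNCF can draw calibration samples from it.
calibration_dataset = nncf.Dataset(calibration_loader, transform_fn)

# Returns an 8-bit quantized model suitable for OpenVINO inference.
quantized_model = nncf.quantize(model, calibration_dataset)
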
“…convert them to integer weights) and results in a quantised CNN library in CheckINN. Quantisation is a common technique in machine learning and NN verification: quantised neural networks take less computational resources to run, are more amenable to verification, and often can be trained to be as accurate as floating point networks [11,25,26]. Modulo this hurdle, verification of the CNNs goes in a straightforward way, and requires just one line of code.…”
Section: Reachability and Symbolic Execution
confidence: 99%
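
As an illustration of the integer-weight conversion this statement describes, here is a generic symmetric int8 quantization scheme; this is a common textbook formulation, not CheckINN's actual procedure.

import numpy as np

def quantize_int8(w):
    # Map the largest weight magnitude to 127; keep one float scale.
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -128, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an approximation of the original float weights.
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(w)
print(np.abs(w - dequantize(q, scale)).max())  # small rounding error
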
“…In a paper from Kozlov et al. [4], the authors discuss the Neural Network Compression Framework (NNCF) for fast model inference. This framework includes quantization and pruning methods for model compression.…”
Section: Related Work
confidence: 99%
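
A minimal sketch of how quantization and filter pruning can be combined in NNCF's training-time, config-driven API, assuming a PyTorch model; exact configuration keys can differ between NNCF releases.

from nncf import NNCFConfig
from nncf.torch import create_compressed_model

nncf_config = NNCFConfig.from_dict({
    "input_info": {"sample_size": [1, 3, 224, 224]},  # expected input shape
    "compression": [
        {"algorithm": "quantization"},                # insert 8-bit fake-quantize ops
        {"algorithm": "filter_pruning",
         "params": {"pruning_target": 0.5}},          # target ~50% of filters pruned
    ],
})

# Wraps the model with compression hooks; fine-tune it as usual afterwards.
compression_ctrl, compressed_model = create_compressed_model(model, nncf_config)
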

Model Compression
Ishtiaq, Mahmood, Anees et al. (2021). Preprint.