2021
DOI: 10.1109/ojcas.2021.3083332

FantastIC4: A Hardware-Software Co-Design Approach for Efficiently Running 4Bit-Compact Multilayer Perceptrons

Abstract: With the growing demand for deploying deep learning models to the "edge", it is paramount to develop techniques that allow state-of-the-art models to execute within very tight and limited resource constraints. In this work we propose a software-hardware optimization paradigm for obtaining a highly efficient execution engine for deep neural networks (DNNs) based on fully-connected layers. The work's approach is centred around compression as a means for reducing the area as well as the power requirements of …
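As a rough illustration of the compression idea in the abstract, the sketch below shows plain symmetric uniform 4-bit quantization of a fully-connected layer's weights. This is a generic scheme for illustration only; the paper's actual FantastIC4 pipeline (entropy-constrained clustering, hardware-aware encoding) is more involved.

```python
import numpy as np

# Illustrative sketch (not the paper's exact scheme): symmetric uniform
# 4-bit quantization maps each float weight to one of 16 integer codes
# plus a shared per-tensor scale, shrinking storage roughly 8x vs float32.
def quantize_4bit(w: np.ndarray):
    """Return 4-bit integer codes in [-8, 7] and the dequantization scale."""
    scale = np.abs(w).max() / 7.0
    codes = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return codes, scale

def dequantize(codes: np.ndarray, scale: float) -> np.ndarray:
    return codes.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.1, size=(64, 32)).astype(np.float32)  # FC-layer weights
codes, scale = quantize_4bit(w)
w_hat = dequantize(codes, scale)
print("max reconstruction error:", np.max(np.abs(w - w_hat)))
```

The rounding error is bounded by half the quantization step (`scale / 2`), which is why low-bit quantization of near-Gaussian weight distributions often costs little accuracy.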

Cited by 9 publications (4 citation statements) · References 36 publications
“…Firstly, we consider the MLP network architecture [30] applied to an image classification task and investigate how the quantization of weights affects the performance of the network, measured by classification accuracy. Specifically, MLP is still attractive and is applied to solving various challenges across different research areas, e.g., [30][31][32][33][34], and hence is worth investigating. Further, the results will also be analyzed in terms of SQNR, by checking the agreement between the theoretically and experimentally obtained values.…”
Section: Results
confidence: 99%
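The SQNR comparison mentioned in this citation statement can be sketched empirically: quantize a weight-like Gaussian sample with a uniform quantizer and compare the measured signal-to-quantization-noise ratio against the classical step-size result. The 4-bit quantizer over [-4, 4] below is an illustrative choice, not the cited paper's design.

```python
import numpy as np

# Empirical SQNR of uniformly quantized, zero-mean Gaussian "weights".
def sqnr_db(w: np.ndarray, w_q: np.ndarray) -> float:
    signal = np.mean(w ** 2)
    noise = np.mean((w - w_q) ** 2)
    return 10.0 * np.log10(signal / noise)

rng = np.random.default_rng(1)
w = rng.normal(0.0, 1.0, 100_000)

# 4-bit uniform quantizer spanning [-4σ, 4σ]: 16 levels, step = 8σ / 16.
step = 8.0 / 2 ** 4
w_q = np.clip(np.round(w / step) * step, -4.0, 4.0)

val = sqnr_db(w, w_q)
print(f"empirical SQNR: {val:.1f} dB")
```

For a non-overloaded uniform quantizer the noise power is approximately step²/12, so theory and measurement should agree closely here, mirroring the agreement check the statement describes.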
“…The first neural network model adopted in this paper is the multi-layer perceptron (MLP) [30], a simple feedforward artificial neural network. Although it can be considered a classical model and has been succeeded by the convolutional neural network (CNN) in advanced computer vision applications, its simplicity can be exploited in edge computing devices for real-time classification tasks [31][32][33][34]. We also employ a simple CNN [30] for analysis, and both networks are used for image classification.…”
Section: Introduction
confidence: 99%
“…As discussed in [47], and experimentally shown in [48], lowering the entropy of DNN weights provides benefits in terms of memory as well as computational complexity. The Entropy-Constrained Quantization (ECQ) algorithm is a clustering algorithm that also takes the entropy of the weight distributions into account.…”
Section: Entropy-Constrained Quantization
confidence: 90%
“…Since weights are usually normally distributed around zero, the entropy term also strongly encourages sparsity. In practice, this quantization scheme works well, yielding sparse, low-bit neural networks for various machine learning tasks and network architectures [48,27,46].…”
Section: Explainability-Driven Entropy-Constrained Quantization
confidence: 97%