2021
DOI: 10.3390/su13020717

Early-Stage Neural Network Hardware Performance Analysis

Abstract: The demand for running NNs in embedded environments has increased significantly in recent years due to the significant success of convolutional neural network (CNN) approaches in various tasks, including image recognition and generation. The task of achieving high accuracy on resource-restricted devices, however, is still considered to be challenging, which is mainly due to the vast number of design parameters that need to be balanced. While the quantization of CNN parameters leads to a reduction of power and …


Cited by: 15 publications (5 citation statements)
References: 50 publications
“…While the quantization of CNN parameters leads to a reduction of power and area, it can also generate unexpected changes in the balance between communication and computation. Karbachevsky et al [33] studied the impact of CNN quantization on hardware implementation of computational resources. It combines the research conducted in Baskin et al [34] to propose a computation and communication analysis for quantized CNN.…”
Section: Related Work | mentioning
confidence: 99%
“…The topologic and hardware designs are based on multiple neuron processing and scalable computation. The neural network architecture can be implemented using a processing engine layout [34] for the hardware performance analysis framework for recognizing bottlenecks in the initial stages of a convolutional neural network (CNN). This methodology is useful for evaluating various architectures for embedded chips and associated applications like hardware accelerators.…”
Section: Related Work | mentioning
confidence: 99%
“…Bit operations (BOPs) (Baskin et al, 2021) is another metric that aims to generalize floating-point operations (FLOPs) to heterogeneously quantized NNs. A hardware-aware complexity metric (HCM) (Karbachevsky et al, 2021) has also been proposed that aims to predict the impact of NN architectural decisions on the final hardware resources. Our work makes use of some of these metrics and further explores the connection and tradeoff between pruning and quantization.…”
Section: Efficiency Metrics | mentioning
confidence: 99%
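
The last citation statement describes bit operations (BOPs) as a generalization of FLOPs to quantized networks. A minimal sketch of how such a count might be computed for one convolutional layer is given below. It assumes the approximation commonly quoted for this metric, BOPs ≈ m·n·k²·(b_a·b_w + b_a + b_w + log2(n·k²)), which is not stated on this page. The function name `conv_bops`, its parameters, and the scaling by output feature-map size are illustrative assumptions, not the cited authors' code.

```python
import math


def conv_bops(n_in, m_out, k, b_w, b_a, out_h=1, out_w=1):
    """Approximate bit operations for a k x k convolution (hypothetical helper).

    n_in, m_out : input and output channel counts
    b_w, b_a    : weight and activation bit widths
    out_h, out_w: output feature-map size (assumed scaling; defaults to one position)
    """
    macs_per_output = n_in * k * k
    # Assumed accumulator width: operand bits plus growth from summing the products.
    acc_bits = b_a + b_w + math.log2(macs_per_output)
    # Each multiply-accumulate is charged b_a * b_w bit operations for the
    # multiplication plus acc_bits for the addition.
    per_position = m_out * macs_per_output * (b_a * b_w + acc_bits)
    return per_position * out_h * out_w


# Halving both bit widths lowers the estimate for an otherwise identical layer.
bops_8bit = conv_bops(64, 64, 3, b_w=8, b_a=8, out_h=56, out_w=56)
bops_4bit = conv_bops(64, 64, 3, b_w=4, b_a=4, out_h=56, out_w=56)
print(bops_4bit / bops_8bit)
```

Under these assumptions the count grows with the product of the two bit widths, which is why a single FLOP count cannot distinguish layers quantized to different precisions.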