2019 29th International Conference on Field Programmable Logic and Applications (FPL)
DOI: 10.1109/fpl.2019.00063
Reducing Dynamic Power in Streaming CNN Hardware Accelerators by Exploiting Computational Redundancies

Cited by 10 publications (7 citation statements) · References 24 publications
“…The efficiency of the Skipping approximation relies on how often a computation can be skipped, the complexity of the conditional prediction, and the complexity of the skipped operation. Piyasena et al [94] leverage the widely used ReLU activation function to eliminate redundant computations. [94] estimates the sign of the convolution output using a low-cost prediction scheme.…”
Section: Skipping
confidence: 99%
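The skipping idea described above can be sketched in a few lines. The snippet below is a minimal illustration, not the paper's exact scheme: it estimates the sign of a dot product from truncated (low-precision) operands and, when the estimate is negative, returns zero directly, since ReLU would clamp the result anyway. The function names and the truncation width are assumptions for illustration.

```python
def truncate(x, drop_bits=4):
    """Keep only the high-order bits of an integer operand (sign preserved)."""
    return (x >> drop_bits) << drop_bits

def predicted_nonnegative(inputs, weights, drop_bits=4):
    """Cheap sign estimate of the dot product using truncated operands."""
    approx = sum(truncate(i, drop_bits) * truncate(w, drop_bits)
                 for i, w in zip(inputs, weights))
    return approx >= 0

def relu_dot(inputs, weights, drop_bits=4):
    """Full-precision dot product + ReLU, computed only when the
    low-cost estimate predicts a non-negative output."""
    if not predicted_nonnegative(inputs, weights, drop_bits):
        return 0  # skip: ReLU would zero a negative output anyway
    return max(0, sum(i * w for i, w in zip(inputs, weights)))
```

The power saving comes from how often the cheap predictor fires: every skipped window avoids a full-precision multiply-accumulate chain, at the cost of occasional mispredictions near zero.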
“…Piyasena et al [94] leverage the widely used ReLU activation function to eliminate redundant computations. [94] estimates the sign of the convolution output using a low-cost prediction scheme. In this scheme, a power-of-two weight quantization is applied so that multiplications can be replaced with simple logic shifters.…”
Section: Skipping
confidence: 99%
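The power-of-two quantization mentioned above can be illustrated briefly. This is a hypothetical sketch, not the paper's exact quantizer: each weight is rounded to a signed power of two, so a multiplication reduces to a left shift plus a sign fix-up, which in hardware replaces a full multiplier with a barrel shifter. Integer weights with magnitude ≥ 1 are assumed here.

```python
import math

def quantize_pow2(w):
    """Round a nonzero integer weight to (sign, exponent) with
    w ≈ sign * 2**exponent (assumes |w| >= 1)."""
    if w == 0:
        return 0, 0
    exp = round(math.log2(abs(w)))
    return (1 if w > 0 else -1), exp

def shift_mul(x, sign, exp):
    """Compute x * (sign * 2**exp) with a shift instead of a multiply."""
    return sign * (x << exp)
```

For example, a weight of 9 quantizes to +2^3, so multiplying by it becomes a 3-bit left shift; the quantization error (9 vs. 8) is the accuracy cost traded for the cheaper datapath.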
“…There are a number of hardware architectures in the literature that aim to accelerate CNN applications while reducing computational redundancies [14, 15, 16, 17]. There are also approaches that attempt to exploit the high bandwidth available near the sensor interface by bringing the computation closer to the image sensor [7].…”
Section: Related Work
confidence: 99%
“…Many real-world applications such as robotics, self-driving cars, augmented reality, video surveillance, mobile apps, and smart city applications [38]–[40] require IoT devices capable of AI inference. Thus, DNN inference has also been demonstrated on various embedded Systems-on-Chip (SoCs) such as Nvidia Tegra and Samsung Exynos, as well as application-specific FPGA designs (ESE [34], SCNN [41], [42], [43]) and ASICs such as Google TPU and Movidius NCS, which is used later in our experiment. Except for FPGAs, most of these devices are generalized to work with the majority of DNN architectures.…”
Section: Introduction
confidence: 99%