2021
DOI: 10.3390/s21124195

An Approximate GEMM Unit for Energy-Efficient Object Detection

Abstract: Edge computing brings artificial intelligence algorithms and graphics processing units closer to data sources, making autonomy and energy-efficient processing vital for their design. Approximate computing has emerged as a popular strategy for energy-efficient circuit design, where the challenge is to achieve the best tradeoff between design efficiency and accuracy. The essential operation in artificial intelligence algorithms is the general matrix multiplication (GEMM) operation comprised of matrix multiplicat…
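For context, the GEMM operation the abstract refers to is, in its standard BLAS-style form, the update C <- alpha*A*B + beta*C, whose core is a multiply-accumulate loop. The sketch below is only a plain-Python reference of that operation, not the paper's approximate hardware unit; the function name, the alpha/beta scaling convention, and the list-of-lists matrix layout are illustrative assumptions.

```python
def gemm(A, B, C, alpha=1.0, beta=1.0):
    """Reference GEMM: C <- alpha * (A @ B) + beta * C.

    A is n x k, B is k x m, C is n x m, all plain lists of lists.
    """
    n, k, m = len(A), len(B), len(B[0])
    for i in range(n):
        for j in range(m):
            acc = 0.0
            for p in range(k):
                acc += A[i][p] * B[p][j]   # multiply-accumulate core
            C[i][j] = alpha * acc + beta * C[i][j]
    return C


# Example: 2x2 update with beta = 1 and an all-zero C
A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
C = [[0.0, 0.0], [0.0, 0.0]]
print(gemm(A, B, C))   # [[19.0, 22.0], [43.0, 50.0]]
```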


Cited by 7 publications (4 citation statements)
References 91 publications
“…As future work, we plan to improve the algorithm to implement it on heterogeneous multicore CPU/GPU architectures, as done in [36], and to optimize the portable design of memory accesses to avoid unwanted overhead on Jetson boards with low CUDA capabilities. Moreover, thanks to the introduction of the Volta architecture on the recent Nvidia Tegra series, the availability of tensor cores opens up new algorithmic designs [37]. Those devices deliver half-precision GEMM (General Matrix Multiply) in one clock cycle, consuming low energy in an edge context.…”
Section: Discussion
confidence: 99%
“…The core element of the ACTA is a dedicated Approximate General matrix multiply hardware Unit (AGU) whose accuracy can be changed on the fly. The envisioned AGU is based on the iterative logarithmic product approximation proposed by Babić et al. [13] and the design of an approximate GEMM unit presented by Pilipović et al. [14]. The AGU does not duplicate the functionality in multiple accuracy versions but incorporates simple logic to approximate the addition and multiplication constituting the GEMM operation.…”
Section: Overview of the Accuracy Tunable Accelerator (ACTA) Platform
confidence: 99%
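The iterative logarithmic product approximation mentioned in the statement above (Babić et al. [13]) replaces an exact multiplication with a leading-one approximation plus optional correction passes on the residues. The sketch below is a minimal software model of that general scheme, assuming unsigned integer operands and a configurable number of correction iterations; it is not the AGU hardware design itself, and the function name and parameters are illustrative.

```python
def ilm_multiply(a: int, b: int, corrections: int = 1) -> int:
    """Iterative logarithmic multiplication (software model).

    Each pass approximates a * b as 2^(k1+k2) + ra*2^k2 + rb*2^k1,
    where k1, k2 are the leading-one positions and ra, rb the residues;
    the leftover error ra * rb is fed into the next correction pass.
    """
    product = 0
    for _ in range(corrections + 1):
        if a == 0 or b == 0:
            break                                # remaining error is zero
        k1, k2 = a.bit_length() - 1, b.bit_length() - 1
        ra, rb = a - (1 << k1), b - (1 << k2)    # residues after the leading one
        product += (1 << (k1 + k2)) + (ra << k2) + (rb << k1)
        a, b = ra, rb                            # next pass corrects ra * rb
    return product


# With zero corrections the result is the basic log-based approximation;
# each extra pass tightens it toward the exact product.
print(ilm_multiply(200, 118, corrections=0), 200 * 118)   # 19712 23600
print(ilm_multiply(200, 118, corrections=3), 200 * 118)   # converges to 23600
```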
“…There have been several attempts to use approximate integer multipliers in neural network learning [12]-[14]. The authors of these studies report that the learning was successful, but they mainly worked with tiny neural networks.…”
Section: Introduction
confidence: 99%