2018 IEEE International Solid-State Circuits Conference (ISSCC)
DOI: 10.1109/isscc.2018.8310262
UNPU: A 50.6TOPS/W unified deep neural network accelerator with 1b-to-16b fully-variable weight bit-precision

Cited by 268 publications (111 citation statements). References 4 publications.
“…ASICs [27], [28], [33], [ Helium, an ISA extension tailored for DSP-oriented workloads, such as an inference task. However, such an extension is not supported yet by any device.…”
Section: Performance, Energy Efficiency, Power Budget, Flexibility
confidence: 99%
“…Both implementations in [22] and [20] have a higher power efficiency than Nullhop, but provide consistently lower performance (<350 GOp/s) using more MAC units. They also require a larger area (16 mm²), but this is justified by their support for recurrent neural networks and variable bit precision.…”
Section: Memory Power Consumption Estimation
confidence: 99%
“…State-of-the-art silicon prototypes such as QUEST [43] or UNPU [44] exploit such strong quantization and voltage scaling, and have been able to measure such high energy efficiency on their devices. The UNPU reaches an energy efficiency of 50.6 TOp/s/W at a throughput of 184 GOp/s with 1-bit weights and 16-bit activations on 16 mm² of silicon in 65 nm technology.…”
Section: FPGA and ASIC Accelerators
confidence: 99%
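The quoted operating point implies a very small power envelope, since an efficiency in Op/s/W is the same as Op/J. A minimal sketch of that back-of-the-envelope check, using only the figures quoted above (the function name is an illustrative assumption, not from the paper):

```python
def implied_power_watts(throughput_ops_per_s: float,
                        efficiency_ops_per_joule: float) -> float:
    """Power (W) implied by a throughput and an energy-efficiency figure.

    Efficiency in Op/s/W equals Op/J, so P = throughput / efficiency.
    """
    return throughput_ops_per_s / efficiency_ops_per_joule

# UNPU's reported 1-bit-weight operating point:
# 184 GOp/s at 50.6 TOp/s/W.
power_w = implied_power_watts(184e9, 50.6e12)
print(f"{power_w * 1e3:.2f} mW")  # roughly 3.6 mW
```

This is consistent with the milliwatt-scale power budgets these citing papers compare against.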
“…Hyperdrive not only exploits the advantages of reduced weight-memory requirements and computational complexity, but fundamentally differs from previous BWN accelerators [26,44,45]. The main concepts can be summarized as: 1) feature maps are stored entirely on-chip, while the weights are streamed to the chip (i.e., feature-map stationary).…”
Section: FPGA and ASIC Accelerators
confidence: 99%