2018 IEEE Symposium on VLSI Circuits
DOI: 10.1109/vlsic.2018.8502404
Sticker: A 0.41-62.1 TOPS/W 8Bit Neural Network Processor with Multi-Sparsity Compatible Convolution Arrays and Online Tuning Acceleration for Fully Connected Layers

Cited by 99 publications (24 citation statements). References 0 publications.
“…The SCNN [37] achieved a performance improvement of 2.7× with Cartesian product-based processing. STICKER [51] applied two-way set-associative PEs to the SCNN architecture, thereby reducing the memory area to 92%. Wang et al. [45] improved the hash-based accelerator for SCNN using a load-balancing algorithm, and SpAW [29] proposed a dual-indexing module to utilize sparsity more efficiently.…”
Section: Zero-Weight and -Activation Skipping Architecture
confidence: 99%
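To make the Cartesian product-based processing mentioned above concrete, the sketch below (an illustrative assumption, not code from SCNN [37] or STICKER [51]; the function name `cartesian_sparse_conv1d` and the compressed (value, position) operand format are invented for this example) multiplies every stored nonzero activation with every stored nonzero weight and scatters each partial product to the output coordinate it contributes to.

```python
# Minimal sketch of Cartesian-product sparse processing in the style of SCNN
# (not code from the cited papers). Nonzero activations and weights are kept
# in compressed (value, position) form; every nonzero activation is multiplied
# with every nonzero weight, and each partial product is scatter-accumulated
# into the output position it affects.

def cartesian_sparse_conv1d(activations, weights, out_len):
    """activations/weights: lists of (value, position) for nonzeros only."""
    out = [0.0] * out_len
    for a_val, a_pos in activations:          # every nonzero activation...
        for w_val, w_pos in weights:          # ...times every nonzero weight
            o_pos = a_pos - w_pos             # output coordinate for a 'valid' conv
            if 0 <= o_pos < out_len:
                out[o_pos] += a_val * w_val   # scatter-accumulate
    return out

# Example: sparse input and kernel; only nonzero pairs are ever touched.
acts = [(2.0, 1), (3.0, 4)]       # nonzeros of a length-6 input
wts  = [(1.0, 0), (-1.0, 2)]      # nonzeros of a length-3 kernel
print(cartesian_sparse_conv1d(acts, wts, out_len=4))   # [0.0, 2.0, -3.0, 0.0]
```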
“…Cambricon-X [52] allowed different PEs to load new data from memory asynchronously to improve overall efficiency. SCNN [37] used Cartesian product-based processing with many memory banks for sparse convolutional layers, and STICKER [51] applied two-way set-associative PEs to the SCNN architecture to reduce the memory area to 92%. To deal with the load-imbalance problem, some accelerators introduced very wide memories and multiplexers (MUXs), becoming more complex.…”
Section: Introduction
confidence: 99%
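The load-imbalance point above can be illustrated with a small sketch of two-way set-associative work assignment (an assumption about the general idea of set-associative PE mapping, not the cited STICKER design; the assignment function and the candidate-PE hashing are invented for this example). Each nonzero may go to only one of two candidate PEs, so routing stays far simpler than a full crossbar while the choice between the two candidates still smooths out the per-PE load.

```python
# Illustrative two-way set-associative assignment of sparse work items to PEs.
# Each nonzero index has exactly two candidate PEs; the less-loaded candidate
# is chosen, which bounds routing cost while mitigating load imbalance.

def assign_two_way(nonzero_indices, num_pes):
    loads = [0] * num_pes
    assignment = {}
    for idx in nonzero_indices:
        way0 = idx % num_pes                   # first candidate PE
        way1 = (idx // num_pes) % num_pes      # second candidate PE
        pe = way0 if loads[way0] <= loads[way1] else way1
        assignment[idx] = pe
        loads[pe] += 1
    return assignment, loads

# Example: 20 scattered nonzero indices mapped onto 4 PEs.
nz = [0, 1, 2, 5, 8, 9, 12, 13, 16, 17, 20, 21, 24, 25, 28, 29, 30, 31, 33, 37]
assignment, loads = assign_two_way(nz, num_pes=4)
print(loads)   # per-PE work counts stay close to 20 / 4 = 5
```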
“…4-(b) shows the structure of the back-end 8-bit asynchronous SAR ADC, where a binary-scaled capacitor DAC (CDAC) is implemented using a unit capacitor C0 = 20 fF. Unlike the top-plate sampling scheme used in [2], a bottom-plate sampling scheme is used for the CDAC, which has the advantage of being immune to the reduction of the input full-scale range caused by the parasitic capacitance of the CDAC. This characteristic is critical because any gain error in A/D conversion will hurt overall MAC operation accuracy.…”
Section: Series Unit Cap
confidence: 99%
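The gain-error argument above can be seen in a behavioral model of an 8-bit SAR conversion with a binary-scaled DAC (a generic sketch, not the circuit in the cited paper; the function `sar_adc_8bit` and its `gain` parameter are invented here to mimic the effect of a sampling gain error such as the one bottom-plate sampling avoids).

```python
# Behavioral sketch of an 8-bit SAR conversion with a binary-weighted DAC.
# A gain factor below 1.0 models the full-scale compression that top-plate
# sampling suffers from CDAC parasitics; the shifted output codes show why
# such a gain error propagates directly into analog-MAC accuracy.

def sar_adc_8bit(vin, vref=1.0, gain=1.0):
    """Successive approximation: test each binary-weighted bit MSB-first."""
    vin = vin * gain                 # gain != 1.0 models a sampling gain error
    code = 0
    for bit in range(7, -1, -1):     # MSB (bit 7) down to LSB (bit 0)
        trial = code | (1 << bit)                # tentatively set this bit
        vdac = (trial + 0.5) * vref / 256.0      # DAC comparison threshold
        if vin >= vdac:                          # comparator decision
            code = trial                         # keep the bit, else clear it
    return code

print(sar_adc_8bit(0.5))              # ideal: mid-scale code near 128
print(sar_adc_8bit(0.5, gain=0.95))   # a 5% gain error shifts every code down
```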
“…The SCNN [18] achieved a performance improvement of 2.7× with Cartesian product-based processing. STICKER [19] applied two-way set-associative PEs to the SCNN architecture, thereby reducing the memory area to 92%. Wang et al. [20] improved the hash-based accelerator for SCNN using a load-balancing algorithm, and SpAW [21] proposed a dual-indexing module to utilize the sparsity more efficiently.…”
Section: Zero-Weight and -Activation Skipping Architecture
confidence: 99%
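The dual-indexing idea referenced above can be pictured as matching compressed weight and activation indices so that a multiply happens only where both operands are nonzero. The sketch below is an assumed reading of that idea, not code from SpAW [21]/[29]; the two-pointer intersection and the `sparse_dot` helper are invented for illustration.

```python
# Rough illustration of index matching over two compressed sparse operands:
# each side is stored as sorted (index, value) pairs of its nonzeros, and a
# two-pointer walk finds the indices where both weight and activation are
# nonzero, so only those pairs reach the MAC.

def sparse_dot(acts, wts):
    """acts, wts: lists of (index, value), sorted by index, nonzeros only."""
    i = j = 0
    total = 0.0
    while i < len(acts) and j < len(wts):
        ai, av = acts[i]
        wi, wv = wts[j]
        if ai == wi:                 # both nonzero at this index: do the MAC
            total += av * wv
            i += 1
            j += 1
        elif ai < wi:                # activation index is behind: advance it
            i += 1
        else:                        # weight index is behind: advance it
            j += 1
    return total

print(sparse_dot([(0, 2.0), (3, 1.5), (7, -1.0)],
                 [(3, 4.0), (5, 2.0), (7, 3.0)]))   # 1.5*4.0 + (-1.0)*3.0 = 3.0
```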