2020 57th ACM/IEEE Design Automation Conference (DAC)
DOI: 10.1109/dac18072.2020.9218534
BitPruner: Network Pruning for Bit-serial Accelerators

Cited by 19 publications (3 citation statements)
References 10 publications
“…In terms of granularity, accelerators can exploit bit-wise sparsity via bit-serial computation [1,31], unstructured element-wise sparsity of either activations or weights [2,5,6,8,11,20,29,38], or structured sparsity via a co-designed pruning algorithm [17,37,41]. BitPruner [39] applies structured bit-wise pruning to benefit bit-serial architectures. Our approach also falls under the structured pruning category, but with one key distinction: the pruning framework is closely designed with the dataflow.…”
Section: Related Work (mentioning)
confidence: 99%
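As a rough illustration of the bit-level pruning idea these statements refer to, the sketch below constrains each quantized weight to a fixed budget of nonzero magnitude bits, so that a bit-serial datapath that skips zero bits finishes every weight in a group after the same number of cycles. The bit budget, helper names, and 8-bit quantization are assumptions for illustration only, not BitPruner's actual algorithm.

```python
# Hypothetical sketch (not the BitPruner implementation): structured bit-level
# pruning for a bit-serial accelerator. Each weight is an 8-bit signed integer,
# and every weight in a group keeps at most MAX_BITS nonzero magnitude bits,
# so all bit-serial lanes in the group take the same number of cycles.

MAX_BITS = 2  # assumed per-weight bit budget


def prune_bits(w: int, max_bits: int = MAX_BITS) -> int:
    """Keep only the `max_bits` most significant set bits of |w|."""
    sign = -1 if w < 0 else 1
    mag = abs(w)
    kept, bits = 0, 0
    for pos in range(mag.bit_length() - 1, -1, -1):  # scan MSB -> LSB
        if mag & (1 << pos):
            kept |= 1 << pos
            bits += 1
            if bits == max_bits:
                break
    return sign * kept


def prune_group(weights):
    """Apply the same per-weight bit budget to a whole group (structured)."""
    return [prune_bits(w) for w in weights]


if __name__ == "__main__":
    group = [93, -70, 5, -128]     # 0b1011101, -0b1000110, 0b101, -0b10000000
    print(prune_group(group))      # -> [80, -68, 5, -128] with MAX_BITS = 2
```

With MAX_BITS = 2, for example, the weight 93 (0b1011101) is truncated to 80 (0b1010000): a small quantization error is traded for a fixed two-cycle bit-serial multiply across the whole group.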
“…[1]. While training optimizations for such architectures have recently been proposed, they do not fully solve the scheduling issues [16].…”
Section: Shared Weight Bit-sparsity (mentioning)
confidence: 99%
“…Deep learning models, especially neural networks, are known to be inherently fault-tolerant, mainly because the widely used activation functions, pooling layers, and ranking-based outputs are usually insensitive to computing variations. Many prior works exploited this inherent fault tolerance for higher energy efficiency, higher performance, and lower memory footprint with approaches like voltage scaling [12] [13], DRAM refresh scaling [14], and low-bit-width quantization [15] [16]. However, this inherent fault tolerance does not guarantee resilience against hardware faults and can even result in substantial accuracy variation across different fault configurations, according to the investigations in [17] [18] [19], which aggravates the uncertainty of deep learning processing and hinders its deployment in safety-critical applications.…”
Section: Introduction (mentioning)
confidence: 99%
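To make the "accuracy variation across fault configurations" point concrete, a minimal fault-injection sketch follows: it flips random bits in stored 8-bit weight codes and reports the worst-case perturbation for several random fault configurations at the same fault count. The fault model, names, and values are hypothetical and are not taken from the cited investigations.

```python
# Hypothetical sketch: inject random single-bit faults into stored 8-bit weight
# codes and observe how the worst-case perturbation changes with the fault
# configuration even though the fault count is identical.

import random


def flip_random_bits(codes, n_faults, bit_width=8, seed=0):
    """Return a copy of `codes` with `n_faults` random single-bit flips."""
    rng = random.Random(seed)
    faulty = list(codes)
    for _ in range(n_faults):
        idx = rng.randrange(len(faulty))
        bit = rng.randrange(bit_width)
        faulty[idx] ^= 1 << bit  # flip one bit of the stored 8-bit code
    return faulty


if __name__ == "__main__":
    codes = [17, 3, 120, 44, 90, 7, 0, 63]   # stored unsigned 8-bit weight codes
    for seed in range(3):                    # same fault count, different configurations
        faulty = flip_random_bits(codes, n_faults=2, seed=seed)
        max_err = max(abs(a - b) for a, b in zip(codes, faulty))
        print(f"fault configuration {seed}: worst-case code error {max_err}")
```

A flip in a high-order bit changes a code by up to 128 while a low-order flip changes it by 1, which is one simple way to see why the same fault rate can produce very different accuracy outcomes.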