2022
DOI: 10.1109/tnnls.2021.3063265

Non-Structured DNN Weight Pruning—Is It Beneficial in Any Platform?

Cited by 41 publications (16 citation statements)
References 46 publications

“…Therefore, it cannot effectively and efficiently leverage the hardware parallelism provided by the underlying system. Consequently, unstructured pruning is generally not compatible with GPU acceleration for DNN inference, and speed degradation can often be observed [52].…”
Section: Background and Related Work, 2.1 DNN Pruning: Regularity and A... (mentioning)
confidence: 99%
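The GPU-incompatibility point in this statement can be seen in a minimal NumPy sketch (illustrative only, not code from the cited paper): unstructured magnitude pruning leaves the weight matrix the same size, so a dense kernel still does the same work, whereas structured (whole-row/filter) pruning produces a genuinely smaller dense matrix.

```python
# Minimal sketch (assumption: magnitude-based pruning criterion) contrasting
# unstructured pruning, which zeroes individual small weights wherever they
# sit, with structured pruning, which removes entire rows and therefore
# shrinks the dense matrix a GPU actually multiplies.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8)).astype(np.float32)

# Unstructured: keep the largest 50% of individual weights.
threshold = np.quantile(np.abs(W), 0.5)
unstructured_mask = (np.abs(W) >= threshold).astype(np.float32)
W_unstructured = W * unstructured_mask   # still 8x8, irregular zeros

# Structured: drop the 50% of rows with the smallest L2 norm.
row_norms = np.linalg.norm(W, axis=1)
kept_rows = np.sort(np.argsort(row_norms)[W.shape[0] // 2:])
W_structured = W[kept_rows]              # dense 4x8 matrix, smaller GEMM

x = rng.normal(size=(8,)).astype(np.float32)
# The unstructured product still runs a full dense 8x8 GEMV; the zeros are
# multiplied like any other value unless a specialized sparse kernel is used.
y_unstructured = W_unstructured @ x
# The structured product is a genuinely smaller 4x8 GEMV.
y_structured = W_structured @ x
print(y_unstructured.shape, y_structured.shape)
```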
“…The majority of works in this direction apply a pretraining-pruning-retraining flow, which is not compatible with the training-on-the-edge paradigm. According to the adopted sparsity scheme, those works can be categorized as unstructured [16,1], structured [24,2,25,26,17,3,27,28,29,30,31,18,32,33], and fine-grained structured [19,34,35,36,37,38,39,40,41] including the pattern-based and block-based ones. Detailed discussion about these sparsity schemes is provided in Appendix A.…”
Section: Sparsity Scheme (mentioning)
confidence: 99%
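A hedged sketch of the three sparsity-scheme families named in this statement, expressed as binary masks over a weight matrix; the 50% sparsity target and 4x4 block size are illustrative assumptions, not values from the cited works.

```python
import numpy as np

def unstructured_mask(W, sparsity):
    """Zero the smallest-magnitude individual weights."""
    t = np.quantile(np.abs(W), sparsity)
    return (np.abs(W) >= t).astype(W.dtype)

def structured_row_mask(W, sparsity):
    """Zero entire rows (e.g., output channels) with the smallest L2 norm."""
    norms = np.linalg.norm(W, axis=1)
    t = np.quantile(norms, sparsity)
    return np.repeat((norms >= t)[:, None], W.shape[1], axis=1).astype(W.dtype)

def block_mask(W, sparsity, block=(4, 4)):
    """Fine-grained structured: zero whole blocks with the smallest L2 norm."""
    bh, bw = block
    rows, cols = W.shape
    blocks = W.reshape(rows // bh, bh, cols // bw, bw)
    norms = np.sqrt((blocks ** 2).sum(axis=(1, 3)))
    t = np.quantile(norms, sparsity)
    keep = (norms >= t).astype(W.dtype)
    return np.kron(keep, np.ones(block, dtype=W.dtype))

W = np.random.default_rng(1).normal(size=(16, 16)).astype(np.float32)
for fn in (unstructured_mask, structured_row_mask, block_mask):
    m = fn(W, 0.5)
    print(fn.__name__, "kept fraction:", float(m.mean()))
```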
“…The key novelty is the fragment polarization technique that enforces the same sign for weights in each fragment. As recent works [43][44][45][46] have demonstrated, the structured pruning and quantization are two essential steps for hardware-friendly model compression that are universally applicable to all DNN accelerators. Thus, we perform structured pruning before fragment polarization considering the size of the ReRAM crossbars, and quantization after.…”
Section: Motivation (mentioning)
confidence: 99%
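The fragment polarization idea described in this statement can only be illustrated loosely here: the sketch below is a hypothetical post-hoc version (the fragment length and the zeroing of minority-sign weights are assumptions), not the cited paper's actual technique, which may instead enforce the sign constraint during training.

```python
# Hypothetical illustration: partition a weight column into fragments sized
# to an assumed ReRAM crossbar segment and force each fragment to carry a
# single sign by dropping minority-sign weights. Illustrative only.
import numpy as np

FRAGMENT_LEN = 8  # assumed fragment size, not taken from the cited work

def polarize_fragments(w_col, fragment_len=FRAGMENT_LEN):
    """Make every length-`fragment_len` fragment of a weight column single-signed."""
    w = w_col.copy()
    for start in range(0, len(w), fragment_len):
        frag = w[start:start + fragment_len]          # view into w
        dominant = 1.0 if frag.sum() >= 0 else -1.0   # majority sign
        frag[np.sign(frag) != dominant] = 0.0         # drop minority-sign weights
    return w

col = np.random.default_rng(2).normal(size=(32,)).astype(np.float32)
polarized = polarize_fragments(col)
print("signs present per fragment:",
      [np.unique(np.sign(polarized[i:i + FRAGMENT_LEN])).tolist()
       for i in range(0, len(polarized), FRAGMENT_LEN)])
```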