2019 IEEE/ACM International Symposium on Nanoscale Architectures (NANOARCH)
DOI: 10.1109/nanoarch47378.2019.181304
ResNet Can Be Pruned 60×: Introducing Network Purification and Unused Path Removal (P-RM) after Weight Pruning

Abstract: State-of-the-art DNN structures involve heavy computation and large memory footprints, which place intense pressure on DNN framework resources. To mitigate these challenges, weight pruning techniques have been studied. However, a high-accuracy solution for extreme structured pruning that combines different types of structured sparsity remains elusive, owing to the severely reduced number of weights left in the network. In this paper, we propose a DNN framework which combines two different types of structured w…
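To make the sparsity types concrete, below is a minimal NumPy sketch of magnitude-based filter pruning and column pruning on a convolutional weight tensor. It only illustrates the two kinds of structured sparsity the paper combines; the keep ratios and the magnitude criterion are illustrative assumptions, not the authors' ADMM-based procedure.

import numpy as np

def prune_filters(w, keep_ratio):
    # Zero whole filters (rows of the GEMM view) with the smallest L2 norms.
    norms = np.linalg.norm(w.reshape(w.shape[0], -1), axis=1)
    n_keep = max(1, int(round(keep_ratio * w.shape[0])))
    cut = np.sort(norms)[::-1][n_keep - 1]
    return w * (norms >= cut)[:, None, None, None]

def prune_columns(w, keep_ratio):
    # Zero whole columns of the (out_ch, in_ch*k*k) GEMM view of the tensor.
    flat = w.reshape(w.shape[0], -1)
    norms = np.linalg.norm(flat, axis=0)
    n_keep = max(1, int(round(keep_ratio * flat.shape[1])))
    cut = np.sort(norms)[::-1][n_keep - 1]
    return (flat * (norms >= cut)[None, :]).reshape(w.shape)

w = np.random.randn(64, 32, 3, 3)            # (out_ch, in_ch, k, k)
w = prune_columns(prune_filters(w, 0.5), 0.5)
print("nonzero fraction:", np.count_nonzero(w) / w.size)   # about 0.25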

Cited by 31 publications (20 citation statements)
References 12 publications
“…The key novelty is the fragment polarization technique that enforces the same sign for weights in each fragment. As recent works [43][44][45][46] have demonstrated, structured pruning and quantization are two essential steps for hardware-friendly model compression that are universally applicable to all DNN accelerators. Thus, we perform structured pruning before fragment polarization, considering the size of the ReRAM crossbars, and quantization after.…”
Section: Motivation
confidence: 99%
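As a rough illustration of the fragment-polarization idea quoted above, the sketch below tiles a weight matrix into crossbar-sized fragments and forces each fragment to a single sign by zeroing its minority-sign entries. The fragment size and the zero-out rule are assumptions made for illustration, not necessarily the cited paper's exact procedure.

import numpy as np

def polarize_fragments(w, frag_rows=128, frag_cols=128):
    # Each (frag_rows x frag_cols) tile maps to one ReRAM crossbar; force all
    # of its surviving weights to share the tile's dominant sign.
    w = w.copy()
    for r in range(0, w.shape[0], frag_rows):
        for c in range(0, w.shape[1], frag_cols):
            frag = w[r:r + frag_rows, c:c + frag_cols]   # view into w
            if frag.sum() < 0:
                frag[frag > 0] = 0.0   # keep only negative weights
            else:
                frag[frag < 0] = 0.0   # keep only non-negative weights
    return w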
“…Running such models on MCUs requires extreme minimization of the model's size and computation requirements with minimal loss of accuracy. Recent studies have applied model compression techniques such as pruning connections and neurons from fully connected neural networks (FCNNs) [17], filter or channel pruning from convolutional neural networks (CNNs) [18]–[20], knowledge distillation [21], and low-precision quantization [17], [22]. The most recent advancement in extreme downsizing of DL models and their computation requirements is XNOR-Net [23], where a model's activations and inputs are fully binarized.…”
Section: Introduction
confidence: 99%
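The XNOR-Net weight binarization [23] mentioned in this excerpt admits a compact sketch: each filter is approximated by its sign pattern scaled by alpha = mean(|W|), computed per output channel. The tensor shapes below are illustrative.

import numpy as np

def binarize_filters(w):
    # w: (out_ch, in_ch, k, k) -> alpha * sign(w), with one alpha per filter.
    flat = w.reshape(w.shape[0], -1)
    alpha = np.abs(flat).mean(axis=1)            # per-filter scale factor
    return (np.sign(flat) * alpha[:, None]).reshape(w.shape)

w = np.random.randn(16, 8, 3, 3)
wb = binarize_filters(w)
print(np.unique(np.abs(wb[0])).size)             # 1: a single magnitude per filter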
“…From the pruning-algorithm aspect, heuristic-based pruning was first proposed in [23] and was later improved with more sophisticated heuristics [19,27,36,49,74,87]. Regularization-based pruning [21,26,39,41,43,55,56,62,69,76,77,81], on the other hand, is more mathematics-oriented. Recent works [39,51,62,81,82] achieve substantial weight reduction without hurting accuracy by leveraging the Alternating Direction Method of Multipliers (ADMM) with dynamic regularization penalties, but these methods require manually setting the compression rate for each layer.…”
Section: Introduction
confidence: 99%
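The ADMM-based pruning this excerpt refers to alternates between an unconstrained weight update under a quadratic penalty and a Euclidean projection onto the sparsity constraint set. A schematic round is sketched below; the loss gradient is a placeholder, and rho, lr, and k are illustrative hyperparameters, not values from the cited works.

import numpy as np

def project_topk(w, k):
    # Euclidean projection onto {at most k nonzeros}: keep the k largest |w|.
    z = np.zeros_like(w)
    idx = np.argsort(np.abs(w).ravel())[-k:]
    z.ravel()[idx] = w.ravel()[idx]
    return z

def admm_round(w, z, u, grad_loss, rho=1e-3, lr=1e-2, k=100, steps=10):
    for _ in range(steps):         # W-step: SGD on loss + (rho/2)*||W - Z + U||^2
        w = w - lr * (grad_loss(w) + rho * (w - z + u))
    z = project_topk(w + u, k)     # Z-step: projection onto the sparsity set
    u = u + w - z                  # dual-variable update
    return w, z, u

t = np.random.randn(20, 20)        # toy target; stand-in loss = ||w - t||^2
w = np.random.randn(20, 20)
z, u = project_topk(w, 100), np.zeros_like(w)
for _ in range(5):
    w, z, u = admm_round(w, z, u, lambda w: 2 * (w - t), k=100)
print("nonzeros in z:", np.count_nonzero(z))   # at most 100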