2020
DOI: 10.1609/aaai.v34i04.5954

PCONV: The Missing but Desirable Sparsity in DNN Weight Pruning for Real-Time Execution on Mobile Devices

Abstract: Model compression techniques on Deep Neural Network (DNN) have been widely acknowledged as an effective way to achieve acceleration on a variety of platforms, and DNN weight pruning is a straightforward and effective method. There are currently two mainstreams of pruning methods representing two extremes of pruning regularity: non-structured, fine-grained pruning can achieve high sparsity and accuracy, but is not hardware friendly; structured, coarse-grained pruning exploits hardware-efficient structures in pr…
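The abstract contrasts the two pruning extremes it names. As a rough, hedged illustration (not the paper's code), the sketch below builds a fine-grained magnitude mask and a whole-filter mask over a toy convolution weight tensor; the tensor shape, the 75% sparsity target, and the L2 filter-norm criterion are assumptions made only for this example.

```python
import numpy as np

# Illustrative sketch only: the two pruning extremes named in the abstract,
# applied to a toy Conv weight tensor of shape (out_channels, in_channels, 3, 3).
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4, 3, 3))
sparsity = 0.75  # assumed target, for illustration

# Non-structured (fine-grained) pruning: zero the smallest-magnitude weights
# anywhere in the tensor -> high sparsity, but irregular and hardware-unfriendly.
threshold = np.quantile(np.abs(W), sparsity)
fine_mask = np.abs(W) > threshold

# Structured (coarse-grained) pruning: drop whole filters with the smallest
# L2 norm -> regular and hardware friendly, but coarse.
filter_norms = np.linalg.norm(W.reshape(W.shape[0], -1), axis=1)
keep = filter_norms >= np.quantile(filter_norms, sparsity)
coarse_mask = np.zeros_like(W, dtype=bool)
coarse_mask[keep] = True

print("fine-grained sparsity:", 1 - fine_mask.mean())
print("structured sparsity:  ", 1 - coarse_mask.mean())
```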

Cited by 132 publications (83 citation statements) | References 29 publications
“…Chen et al. proposed sparse complementary convolution, in which half of the weights, arranged in regular patterns within the original convolution kernels, can be removed with little accuracy loss [8]. Ma et al. proposed pattern-based kernel pruning [29]. The convolution kernels can only be pruned to one of several pre-defined patterns, so the pruned model retains some regular structure.…”
Section: Related Work
mentioning
confidence: 99%
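To make the quoted idea of pre-defined kernel patterns concrete, here is a minimal, hypothetical sketch (not Ma et al.'s actual pattern set or implementation): each 3x3 kernel is projected onto whichever of a few made-up 4-entry patterns preserves the most weight magnitude.

```python
import numpy as np

# Hypothetical pattern-based kernel pruning: every 3x3 kernel keeps only the
# entries of one pre-defined pattern. The patterns below are invented for
# illustration; PCONV derives its own small pattern set.
PATTERNS = [
    np.array([[0, 1, 0], [1, 1, 1], [0, 0, 0]], dtype=bool),
    np.array([[0, 0, 0], [1, 1, 1], [0, 1, 0]], dtype=bool),
    np.array([[0, 1, 0], [1, 1, 0], [0, 1, 0]], dtype=bool),
    np.array([[0, 1, 0], [0, 1, 1], [0, 1, 0]], dtype=bool),
]

def prune_to_pattern(kernel):
    """Keep the pre-defined pattern that preserves the most weight magnitude."""
    scores = [np.abs(kernel[p]).sum() for p in PATTERNS]
    best = PATTERNS[int(np.argmax(scores))]
    return kernel * best

rng = np.random.default_rng(1)
W = rng.normal(size=(8, 4, 3, 3))            # (out, in, kh, kw)
W_pruned = np.stack([[prune_to_pattern(k) for k in filt] for filt in W])
print("kept weights per kernel:", int((W_pruned[0, 0] != 0).sum()))  # -> 4
```

Because every kernel ends up with the same number of nonzeros in one of a few known layouts, the resulting sparsity is regular enough for a compiler or hardware backend to exploit.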
“…The performance gains are limited due to the sparse nature of the computation. Another approach is to design more hardware-amenable pruning strategies [8,29]. For example, a hybrid strategy combining structured and non-structured pruning can achieve good accuracy while maintaining some regular patterns in the pruned model for efficient hardware processing [29,33].…”
mentioning
confidence: 99%
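The regularity argument above can be illustrated with an assumed storage format (again a sketch, not any paper's actual runtime): once kernels are constrained to a small fixed pattern set, each pruned 3x3 kernel reduces to a pattern index plus its few kept weights, which is the kind of layout a compiler or hardware backend can process efficiently.

```python
import numpy as np

# Assumed compact encoding for pattern-pruned kernels (illustration only):
# store (pattern_id, 4 kept weights) instead of 9 mostly-zero weights.
PATTERNS = [
    np.array([[0, 1, 0], [1, 1, 1], [0, 0, 0]], dtype=bool),
    np.array([[0, 0, 0], [1, 1, 1], [0, 1, 0]], dtype=bool),
]

def encode(kernel):
    """Return (pattern_id, kept_weights) for a kernel whose zeros match a pattern."""
    for pid, p in enumerate(PATTERNS):
        if np.array_equal(kernel != 0, p):
            return pid, kernel[p]          # 1 small index + 4 floats instead of 9
    raise ValueError("kernel does not match any pre-defined pattern")

def decode(pid, values):
    """Rebuild the dense 3x3 kernel from its compact form."""
    kernel = np.zeros((3, 3))
    kernel[PATTERNS[pid]] = values
    return kernel

k = np.array([[0.0, 0.3, 0.0], [0.5, -1.2, 0.8], [0.0, 0.0, 0.0]])
pid, vals = encode(k)
assert np.allclose(decode(pid, vals), k)
```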
“…For example, Park et al. (2017) proposed that weight pruning could be used to reduce the complexity of a deep neural network, thereby improving its real-time performance on mobile platforms. Ma et al. (2020) proposed that weights can be quantized to reduce the computational cost of executing deep neural network applications on mobile platforms. Furthermore, Zhou et al. (2019) proposed a new deep neural network designed specifically for mobile platforms to complete the same task quickly and accurately, with supporting experimental results.…”
Section: Performance Optimization Of Neural Network For Mobile-cloud
mentioning
confidence: 99%
“…Chen et al. [16] introduced the idea of model tensorization, defining each node of the computation graph as a tensor expression and using machine learning to find the best mapping from tensor expressions to low-level programs. Ma et al. [17] proposed a new pruning pattern and developed a novel compiler for these vision-inspired convolution kernels to assist DNN inference, achieving real-time execution of PCONV models without sacrificing accuracy. However, these methods aim to optimize model execution time and do not address the self-adaptation of deep learning models.…”
unclassified