2021
DOI: 10.48550/arxiv.2104.03438
Preprint
Convolutional Neural Network Pruning with Structural Redundancy Reduction

Abstract: Convolutional neural network (CNN) pruning has become one of the most successful network compression approaches in recent years. Existing works on network pruning usually focus on removing the least important filters in the network to achieve compact architectures. In this study, we claim that identifying structural redundancy plays a more essential role than finding unimportant filters, theoretically and empirically. We first statistically model the network pruning problem in a redundancy reduction perspective…
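The abstract contrasts structural-redundancy reduction with the common baseline of removing the least important filters. For reference, below is a minimal sketch of that baseline, L1-norm (magnitude) filter pruning in the spirit of Li et al. (2016); the pruning ratio, layer shapes, and helper names are illustrative assumptions, not values or code from the paper.

```python
# Minimal sketch of L1-norm filter pruning (the "least important filters" baseline).
# Ratio, layer sizes, and function names are illustrative assumptions.
import torch
import torch.nn as nn


def l1_filter_scores(conv: nn.Conv2d) -> torch.Tensor:
    # One L1-norm score per output filter (shape: [out_channels]).
    return conv.weight.detach().abs().sum(dim=(1, 2, 3))


def prune_filters(conv: nn.Conv2d, ratio: float = 0.5) -> nn.Conv2d:
    # Keep the filters with the largest L1 norms; drop the rest.
    scores = l1_filter_scores(conv)
    n_keep = max(1, int(conv.out_channels * (1.0 - ratio)))
    keep = torch.topk(scores, n_keep).indices.sort().values
    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    pruned.weight.data = conv.weight.data[keep].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep].clone()
    return pruned


# Example: halve the filters of a 64-filter conv layer.
layer = nn.Conv2d(3, 64, kernel_size=3, padding=1)
print(prune_filters(layer, ratio=0.5))  # Conv2d(3, 32, ...)
```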

Cited by 1 publication (1 citation statement) | References: 49 publications
“…Training compact deep neural networks (DNNs) (Howard et al., 2017) efficiently has become an appealing topic because of the increasing demand for deploying DNNs on resource-limited devices such as mobile phones and drones (Moskalenko et al., 2018). Recently, a large number of approaches have been proposed for training lightweight DNNs with the help of a cumbersome, over-parameterized model, such as network pruning (Li et al., 2016; He et al., 2019; Wang et al., 2021), quantization (Han et al., 2015), factorization (Jaderberg et al., 2014), and knowledge distillation (KD) (Hinton et al., 2015; Phuong & Lampert, 2019; Jin et al., 2020; Yun et al., 2020; Passalis et al., 2020; Wang, 2021). Among all these approaches, knowledge distillation is a popular scheme with which a compact student network is trained by mimicking the softmax output (class probabilities) of a pre-trained deeper and wider teacher model (Hinton et al., 2015).…”
Section: Introduction (mentioning)
confidence: 99%
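The quoted passage describes vanilla knowledge distillation: the student mimics the temperature-softened class probabilities of a larger, pre-trained teacher (Hinton et al., 2015). A minimal PyTorch sketch of that loss is given below; the temperature, mixing weight, and variable names are illustrative assumptions rather than settings from any of the cited works.

```python
# Minimal sketch of the vanilla knowledge-distillation loss (Hinton et al., 2015).
# T and alpha are illustrative hyperparameters, not values from the cited papers.
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    """Combine a softened KL term (teacher -> student) with standard cross-entropy."""
    # Soft targets: student mimics the teacher's temperature-scaled class probabilities.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)  # rescale so gradient magnitudes match the hard-label term
    # Hard targets: cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard


# Usage sketch: the teacher is frozen, only the student is trained.
# with torch.no_grad():
#     teacher_logits = teacher(x)
# loss = distillation_loss(student(x), teacher_logits, y)
```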