“…There are two major methodologies in neural network compression: i) structured pruning and ii) knowledge distillation. Existing work on structured pruning [4,5,6,7,8,9,17,18,21,26,49] addresses only the width of a layer, removing filters according to their importance. However, selecting what to prune from a large CNN is an NP-hard problem: to find the optimal solution, one would need to rank each filter by turning it off and performing inference over all the samples.…”
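The exhaustive ranking the excerpt alludes to can be sketched on a toy model; this is a minimal NumPy illustration (all names and the toy linear "layer" are hypothetical, not from the cited works), showing why ablating each filter and re-running inference over every sample quickly becomes expensive:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a conv layer: F filters, each feeding a linear head.
F, N = 8, 200                      # number of filters, evaluation samples
X = rng.normal(size=(N, F))        # per-filter feature responses
w = rng.normal(size=F)             # downstream weights
y = (X @ w > 0).astype(int)       # labels the unpruned model classifies correctly

def accuracy(mask):
    """Accuracy of the model with filters zeroed wherever mask == 0."""
    preds = ((X * mask) @ w > 0).astype(int)
    return (preds == y).mean()

# Exhaustive single-filter ranking: turn each filter off in turn and run
# inference on all N samples -- O(F * N) forward passes even for this
# one-at-a-time relaxation, and combinatorial if filter subsets are
# scored jointly, which is what makes the exact problem intractable.
base = accuracy(np.ones(F))
drops = [base - accuracy(np.where(np.arange(F) == i, 0.0, 1.0))
         for i in range(F)]
ranking = np.argsort(drops)        # least important filters first
```

Real pruning methods avoid this cost by using cheap proxies for importance (e.g. filter norms or gradient-based scores) instead of per-filter ablation.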