2019 IEEE Winter Conference on Applications of Computer Vision (WACV)
DOI: 10.1109/wacv.2019.00145

Multi-Layer Pruning Framework for Compressing Single Shot MultiBox Detector

Abstract: We propose a framework for compressing the state-of-the-art Single Shot MultiBox Detector (SSD). The framework addresses compression in the following stages: Sparsity Induction, Filter Selection, and Filter Pruning. In the Sparsity Induction stage, the object detector model is sparsified via an improved global threshold. In the Filter Selection & Pruning stage, we select and remove filters using sparsity statistics of filter weights in two consecutive convolutional layers. This results in the model with the size small…
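
The abstract only sketches the pipeline, so below is a minimal, hypothetical NumPy illustration of its two core ideas: magnitude sparsification under a global threshold, and scoring each filter from sparsity statistics of the weights in two consecutive convolutional layers. The function names, the scoring rule (sum of non-zero fractions across the two layers), and the keep_ratio parameter are assumptions made for illustration, not the paper's exact procedure.

```python
import numpy as np

def sparsify(weights, tau):
    """Sparsity induction (simplified): zero out weights whose magnitude
    falls below a single global threshold tau."""
    w = weights.copy()
    w[np.abs(w) < tau] = 0.0
    return w

def filters_to_keep(w_curr, w_next, keep_ratio=0.5):
    """Score each output filter of layer L by the fraction of non-zero
    weights in that filter plus the fraction of non-zero weights in the
    matching input channel of layer L+1; keep the highest-scoring filters.

    w_curr: (C_out, C_in, k, k) weights of layer L (already sparsified)
    w_next: (C_out_next, C_out, k, k) weights of layer L+1
    """
    curr = (w_curr != 0).reshape(w_curr.shape[0], -1).mean(axis=1)
    nxt = (w_next != 0).transpose(1, 0, 2, 3).reshape(w_next.shape[1], -1).mean(axis=1)
    score = curr + nxt
    n_keep = max(1, int(keep_ratio * score.size))
    return np.sort(np.argsort(score)[::-1][:n_keep])

# Toy usage: prune half the 64 filters of a conv layer feeding a second conv.
w1 = sparsify(np.random.randn(64, 32, 3, 3), tau=0.5)
w2 = sparsify(np.random.randn(128, 64, 3, 3), tau=0.5)
keep = filters_to_keep(w1, w2)
w1_pruned, w2_pruned = w1[keep], w2[:, keep]  # prune layer L and its consumer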

Cited by 20 publications (18 citation statements) | References 18 publications
Citation types: 0 supporting, 18 mentioning, 0 contrasting

“…The results of various pruning techniques in terms of accuracy, compression rate and floating-point operations (FLOPs) are presented in Table 4. Our model outperforms the training speed of the model created by the multi-layer pruning framework [41], as the latter applies its three pruning stages only after completely training the network from scratch. Hence the training time increases many-fold in that method, whereas our model first trains the network with the OATM method, which reduces the training time drastically.…”
Section: Results | Citation type: mentioning | Confidence: 99%
“…Refs. [30–36] prune filters that make a minimal contribution to the model. After removing these filters, the model is usually fine-tuned to maintain its performance.…”
Section: Model Compression | Citation type: mentioning | Confidence: 99%
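
The statement above describes the standard recipe of removing minimal-contribution filters and then fine-tuning. As one concrete and widely used instance of that idea, here is a small, hypothetical NumPy sketch that ranks filters by L1 norm (the magnitude criterion popularized by Li et al.); the keep-half ratio is an arbitrary choice for illustration.

```python
import numpy as np

def l1_filter_scores(conv_weight):
    """L1 norm of each output filter; low-norm filters are assumed to
    contribute least and are pruned first."""
    return np.abs(conv_weight).reshape(conv_weight.shape[0], -1).sum(axis=1)

weights = np.random.randn(64, 32, 3, 3)                 # (C_out, C_in, kH, kW)
scores = l1_filter_scores(weights)
keep = np.sort(np.argsort(scores)[scores.size // 2:])   # drop the weakest half
pruned = weights[keep]                                  # then fine-tune the model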
“…Model compression is considered another reliable and economical way to improve the efficiency of convolutional neural networks, and it can be roughly divided into three categories: (a) connection pruning [28,29]; (b) filter pruning [30–36]; and (c) quantization [28,37–39]. These methods can effectively reduce the computation of a convolutional neural network, but this is always achieved at the price of sacrificing accuracy.…”
Section: Introduction | Citation type: mentioning | Confidence: 99%
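
Of the three categories this statement lists, quantization is the one not illustrated above, so here is a minimal, hypothetical sketch of uniform symmetric weight quantization in NumPy; the 8-bit default and the single per-tensor scale are illustrative assumptions, not the schemes of the cited papers.

```python
import numpy as np

def quantize_uniform(w, num_bits=8):
    """Uniform symmetric quantization: map float weights to signed integer
    codes plus one per-tensor scale factor."""
    qmax = 2 ** (num_bits - 1) - 1                      # 127 for 8 bits
    scale = max(float(np.abs(w).max()) / qmax, 1e-12)   # guard all-zero tensors
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int32)
    return q, scale

w = np.random.randn(64, 32, 3, 3).astype(np.float32)
q, scale = quantize_uniform(w)
w_hat = q.astype(np.float32) * scale                    # dequantized approximation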
“…Many common models also use these convolutions to explore new architectures that can reduce FLOPs. Another common way to improve model efficiency is model compression, which can be roughly divided into three categories: connection pruning, filter pruning [1–3] and quantization [4]. There are two different goals for using efficient convolution filters.…”
Section: Convolutional Neural Network | Citation type: mentioning | Confidence: 99%