Optimization based Layer-wise Magnitude-based Pruning for DNN Compression

Li, Guiying; Qian, Chao; Jiang, Chunhui; Lu, Xiaofen; Tang, Ke

doi:10.24963/ijcai.2018/330

Cited by 39 publications

(18 citation statements)

References 11 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Both algorithms are compared on the 20 problems and the averaged final solution qualities over 20 runs are shown in Fig.5. On problems 1,4,7,8,9,12,13,14,17,18,19,20, NPDC outperforms significantly than NPDC-random. For the rest problems, NPDC only shows slightly advantages over NPDC-random.…”

Section: Investigations On the Meta-model Of Npdcmentioning

confidence: 97%

“…Evolutionary Algorithms (EAs), which work by searching the solution space of the targeted problem iteratively and in a randomized way, have shown powerful performance in solving many real-world optimization problems [5], [6], [7], [8], [9]. Unfortunately, the search-based core makes EAs ineffective and inefficient for solving large-scale optimization problems for two reasons: 1) As the number of decision variables increases, the solution space of the problem enlarges exponentially, preventing EAs exploring effectively within reasonable amount of search iterations.…”

Section: Introductionmentioning

confidence: 99%

See 1 more Smart Citation

A Parallel Divide-and-Conquer-Based Evolutionary Algorithm for Large-Scale Optimization

2019

Self Cite

View full text Add to dashboard Cite

Large-scale optimization problems that involve thousands of decision variables have extensively arisen from various industrial areas. As a powerful optimization tool for many real-world applications, evolutionary algorithms (EAs) fail to solve the emerging large-scale problems both effectively and efficiently. In this paper, we propose a novel Divide-and-Conquer (DC) based EA that can not only produce high-quality solution by solving sub-problems separately, but also highly utilizes the power of parallel computing by solving the sub-problems simultaneously. Existing DC-based EAs that were deemed to enjoy the same advantages of the proposed algorithm, are shown to be practically incompatible with the parallel computing scheme, unless some trade-offs are made by compromising the solution quality.

show abstract

Section: Investigations On the Meta-model Of Npdcmentioning

confidence: 97%

Section: Introductionmentioning

confidence: 99%

A Parallel Divide-and-Conquer-Based Evolutionary Algorithm for Large-Scale Optimization

2019

Self Cite

View full text Add to dashboard Cite

show abstract

“…[7] proposed the dynamic network surgery to recover mistaken parameter. [23] proposed an optimization algorithm to automatically tune the pruning thresholds for magnitude-based pruning methods. However, weight pruning methods always lead to unstructured models, so the model cannot leverage the existing efficient BLAS libraries in practice.…”

Section: Related Work 21 Weight Pruningmentioning

confidence: 99%

A Unified-Model via Block Coordinate Descent for Learning the Importance of Filter

Zhang

et al. 2021

Proceedings of the 2021 International Conference on Multimedia Retrieval

View full text Add to dashboard Cite

Deep Convolutional Neural Networks (CNNs) are increasingly used in multimedia retrieval, and accelerating Deep CNNs has recently received an ever-increasing research focus. Among various approaches proposed in the literature, filter pruning has been regarded as a promising solution, which is due to its advantage in significant speedup and memory reduction of both network model and intermediate feature maps. Many works have been proposed to find unimportant filters, and then prune it for accelerating Deep CNNs. However, they mainly focus on using heuristic methods to evaluate the importance of filters, such as the statistical information of filters (e.g., prune filter with small ℓ 2 -norm), which may be not perfect. In this paper, we propose a novel filter pruning method, namely A Unified-Model via Block Coordinate Descent for Learning the Importance of Filter (U-BCD). The importance of the filters in our U-BCD is learned by optimizing method. We can simultaneously learn the filter parameters and the importance of filters by block coordinate descent method. When applied to two image classification benchmarks, the effectiveness of our U-BCD is validated. Notably, on CIFAR-10, our U-BCD reduces more than 57% FLOPs on ResNet-110 with even 0.08% relative accuracy improvement, and also achieve state-of-the-art results on ILSVRC-2012. CCS CONCEPTS• Information systems → Image search; Speech / audio search; • Computing methodologies → Neural networks; • Mathematics of computing → Network optimization.

show abstract

“…This method employs an optimization approach to tune the pruning threshold automatically. The pruning threshold is used to select the most important connections from all layers [20]. Another pruning approach consists in selecting the most discriminative channels based on additional discrimination-aware losses [21].…”

Section: Previous Workmentioning

confidence: 99%

Multilevel Neural Network for Reducing Expected Inference Time

Putra

Leu

2019

IEEE Access

View full text Add to dashboard Cite

It is widely known that deep neural networks (DNNs) can perform well in many applications, and can sometimes exceed human ability. However, their cost limits their impact in a variety of realworld applications, such as IoT and mobile computing. Recently, many DNN compression and acceleration methods have been employed to overcome this problem. Most methods succeed in reducing the number of parameters and FLOPs, but only a few can speed up expected inference times because of either the overhead generated from using such methods or DNN framework deficiencies. Edge-cloud computing has recently emerged and presents an opportunity for new model acceleration and compression techniques. To address the aforementioned problem, we propose a novel technique to speed up expected inference times by using several networks that perform the exact same task with different strengths. Although our method is based on edge-cloud computing, it is suitable for any other hierarchical computing paradigm. Using a simple yet strong enough estimator, the system predicts whether the data should be passed to a larger network or not. Extensive experimental results demonstrate that the proposed technique can speed up expected inference times and beat almost all state-of-the-art compression techniques, including pruning, low-rank approximation, knowledge distillation, and branchy-type networks, on both CPUs and GPUs. INDEX TERMS Edge computing, mobile computing, network compression and acceleration.

show abstract

Optimization based Layer-wise Magnitude-based Pruning for DNN Compression

Cited by 39 publications

References 11 publications

A Parallel Divide-and-Conquer-Based Evolutionary Algorithm for Large-Scale Optimization

A Parallel Divide-and-Conquer-Based Evolutionary Algorithm for Large-Scale Optimization

A Unified-Model via Block Coordinate Descent for Learning the Importance of Filter

Multilevel Neural Network for Reducing Expected Inference Time

Contact Info

Product

Resources

About