2015
DOI: 10.48550/arxiv.1507.06149
Preprint

Data-free parameter pruning for Deep Neural Networks

Cited by 71 publications (87 citation statements)
References 0 publications
“…Our core pruning technique is still unstructured, magnitude pruning (among many other pruning techniques, e.g., Hu et al. (2016); Srinivas and Babu (2015); Dong et al. (2017); Li et al. (2016); Luo et al. (2017); He et al. (2017)). Unstructured pruning does not necessarily yield networks that execute more quickly with commodity hardware or libraries; we aim to convey insight on neural network behavior rather than suggest immediate opportunities to improve performance.…”
Section: Rewinding on Deep Network for ImageNet (mentioning)
confidence: 99%
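The excerpt above names unstructured magnitude pruning as its core technique. As a rough, hedged illustration (not the cited authors' code; the magnitude_prune helper and sparsity argument are names invented here), a minimal one-shot NumPy sketch of the idea could look like this:

import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the fraction `sparsity` of weights with the smallest magnitude.

    Unstructured pruning: individual weights are removed regardless of position,
    so the resulting sparsity pattern is irregular and does not by itself speed
    up dense matrix kernels on commodity hardware.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    if k == 0:
        return weights.copy()
    # Threshold = magnitude of the k-th smallest weight.
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

# Example: prune 80% of a random weight matrix.
w = np.random.randn(256, 128)
w_pruned = magnitude_prune(w, sparsity=0.8)
print(f"sparsity: {np.mean(w_pruned == 0):.2f}")

Any actual speed-up from such a mask depends on sparse kernels or specialized hardware, which is exactly the caveat the excerpt raises.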
“…Without loss of generality, we consider image classification tasks and use ResNet as an example to discuss our proposed on-device learning solution. Image classification is important for many edge applications, and is also the target task of the related model compression and knowledge distillation works (Hinton et al., 2015; Han et al., 2015; Chen et al., 2015; Polino et al., 2018; Srinivas & Babu, 2015). ResNet is a modern architecture with streamlined convolutional layers.…”
Section: Filter Pruning Based Model Compression (mentioning)
confidence: 99%
“…To deploy DNNs on resource-constrained devices, there are two general approaches. The first approach aims to compress already-trained models, using techniques such as weight sharing (Chen et al., 2015), quantization (Han et al., 2015; Kadetotad et al., 2016), and pruning (Han et al., 2015; LeCun et al., 1990; Srinivas & Babu, 2015). However, a compressed model generated by these approaches is useful only for inference; it cannot be retrained to capture user- or device-specific requirements or new data available at runtime.…”
Section: Introduction (mentioning)
confidence: 99%
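The excerpt lists weight sharing, quantization, and pruning as ways to compress an already-trained model. As a simplified, hedged sketch of the general post-training quantization idea (not the specific methods of Han et al., 2015 or Kadetotad et al., 2016; the function names below are invented for illustration), symmetric uniform 8-bit quantization of a trained weight tensor can be written as:

import numpy as np

def quantize_uniform(weights: np.ndarray, num_bits: int = 8):
    """Quantize a trained weight tensor to `num_bits` signed integer codes.

    Returns the codes plus the scale needed to dequantize; this is post-training
    quantization, so no retraining is involved.
    """
    qmax = 2 ** (num_bits - 1) - 1            # e.g. 127 for 8 bits
    scale = np.max(np.abs(weights)) / qmax    # symmetric per-tensor scale
    codes = np.round(weights / scale).astype(np.int8)
    return codes, scale

def dequantize(codes: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float tensor for inference."""
    return codes.astype(np.float32) * scale

w = np.random.randn(64, 64).astype(np.float32)
codes, scale = quantize_uniform(w)
err = np.max(np.abs(w - dequantize(codes, scale)))
print(f"max reconstruction error: {err:.4f}")

As the excerpt notes, a model compressed this way is frozen for inference; capturing new data or user-specific requirements would require retraining, which such one-shot compression does not support.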
“…In this paper, we seek to answer the following questions in the context of iterative structured pruning with rewinding: […] (Han, Mao, and Dally 2015; Kadetotad et al. 2016), knowledge distillation (Polino, Pascanu, and Alistarh 2018; Yim et al. 2017), neural architecture search (Zoph and Le 2016; Pham et al. 2018) and pruning (Li et al. 2016; Han, Mao, and Dally 2015; Srinivas and Babu 2015; Molchanov et al. 2016). There has also been substantial work in manually designing new model topologies, like MobileNet (Howard et al. 2017) and EfficientNet (Tan and Le 2019), that are suitable for edge device deployment but are less accurate compared to traditional models like ResNet (He et al. 2016).…”
Section: Our Solution (mentioning)
confidence: 99%
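The last excerpt concerns structured (filter-level) pruning, which, unlike the unstructured variant sketched earlier, removes whole filters and therefore shrinks the dense computation itself. The following is a minimal, hedged sketch of L1-norm filter selection in the spirit of Li et al. (2016), not their actual implementation; the prune_filters_l1 name and the (out_channels, in_channels, kh, kw) tensor layout are assumptions:

import numpy as np

def prune_filters_l1(conv_weights: np.ndarray, keep_ratio: float) -> np.ndarray:
    """Keep the `keep_ratio` fraction of output filters with the largest L1 norm.

    `conv_weights` has shape (out_channels, in_channels, kh, kw). Removing whole
    filters reduces the layer's actual dimensions, so the pruned model runs
    faster on standard dense hardware, unlike unstructured pruning.
    """
    out_channels = conv_weights.shape[0]
    n_keep = max(1, int(keep_ratio * out_channels))
    # L1 norm of each filter, summed over input channels and kernel window.
    scores = np.abs(conv_weights).reshape(out_channels, -1).sum(axis=1)
    keep = np.sort(np.argsort(scores)[-n_keep:])   # indices of strongest filters
    return conv_weights[keep]

w = np.random.randn(64, 32, 3, 3)
w_small = prune_filters_l1(w, keep_ratio=0.5)
print(w_small.shape)  # (32, 32, 3, 3)

In the iterative-pruning-with-rewinding setting the excerpt describes, a step like this would be interleaved with rewinding the surviving weights to earlier values and retraining, rather than applied once.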