Sparse low rank factorization for deep neural network compression
2020 | DOI: 10.1016/j.neucom.2020.02.035

Cited by 96 publications (69 citation statements) | References 14 publications
“…Low-rank decomposition: Low-rank decomposition algorithms [30, 31, 32] approximate the original set of parameters of a CNN with a lower-rank set to achieve compression. Swaminathan et al. [31] argue that the low-rank decomposition of weight matrices should consider the influence of both the input and the output neurons of a layer. They propose a sparse low-rank (SLR) approach that sparsifies the SVD matrices to obtain a better compression rate by keeping a lower rank for unimportant neurons.…”
Section: Related Work
confidence: 99%
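The idea quoted above can be illustrated with a short sketch: factor a weight matrix by truncated SVD, then retain extra rank only for the output neurons judged important. This is a minimal sketch of the general SLR idea, not the authors' exact algorithm; the column-norm importance score, the `keep_frac` parameter, and the specific ranks are assumptions for illustration.

```python
# Minimal sketch of sparse low-rank weight compression (general idea
# behind SLR [31]). Importance heuristic and ranks are illustrative
# assumptions, not the authors' exact criterion.
import numpy as np

def truncated_svd_factors(W, rank):
    """Factor an m x n matrix W into U_r (m x rank) and V_r (rank x n)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    U_r = U[:, :rank] * s[:rank]   # absorb singular values into U
    V_r = Vt[:rank, :]
    return U_r, V_r

def sparse_low_rank(W, full_rank, low_rank, keep_frac=0.3):
    """Keep `full_rank` components for the most important output columns
    (judged here by column L2 norm -- a hypothetical proxy) and only
    `low_rank` components for the rest, zeroing the extra rows of V."""
    U_r, V_r = truncated_svd_factors(W, full_rank)
    importance = np.linalg.norm(W, axis=0)          # per-output-neuron score
    n_keep = max(1, int(keep_frac * W.shape[1]))
    unimportant = np.argsort(importance)[:-n_keep]  # all but the top columns
    V_r[low_rank:, unimportant] = 0.0               # sparsify V for them
    return U_r, V_r

W = np.random.randn(512, 256)
U_r, V_r = sparse_low_rank(W, full_rank=64, low_rank=16)
err = np.linalg.norm(W - U_r @ V_r) / np.linalg.norm(W)
nonzeros = np.count_nonzero(U_r) + np.count_nonzero(V_r)
print(f"relative error {err:.3f}, stored nonzeros {nonzeros} vs {W.size}")
```

The sparsified factor stores the unimportant neurons at an effective rank of 16 while the important ones keep the full rank of 64, which is where the extra compression over plain truncated SVD comes from.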
“…Parameter pruning, hashing, and quantisation methods exploit redundancy in the model parameters and seek to remove redundant and uncritical ones. Low-rank factorisation techniques [40], [41] employ matrix/tensor decomposition to estimate the informative parameters of DNNs. Compact/transferred convolutional filter methods design specially tailored convolutional filters to reduce the parameter space and to save storage and computation.…”
Section: Active Research Problems In Deep Learning
confidence: 99%
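The appeal of matrix decomposition mentioned in this quote is easy to see from the parameter arithmetic: a rank-r factorisation replaces the m×n weights of a layer with r(m+n). A back-of-envelope check (the layer sizes below are arbitrary examples):

```python
# Back-of-envelope parameter arithmetic for rank-r matrix factorisation:
# an m x n weight matrix (m*n parameters) becomes two factors holding
# r*(m + n) parameters in total. Sizes are illustrative assumptions.
m, n, r = 512, 256, 32
full, factored = m * n, r * (m + n)
print(f"{full} -> {factored} parameters ({full / factored:.1f}x smaller)")
```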
“…Effective methods for reducing model size and computation include information-compression architectures such as SqueezeNet [27] and depth-wise separable filters as in the MobileNets [28]. Post-training model optimization can instead be achieved without significant loss of performance by employing techniques such as quantization [29], factorization [30], distillation [31] and pruning [32]. The development of edge-efficient models has recently led to an industry movement toward such frameworks.…”
Section: Introduction
confidence: 99%
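As a rough illustration of why the depth-wise separable filters cited here shrink models, a standard convolution's k·k·C_in·C_out parameters split into a depthwise term plus a pointwise term; the kernel and channel sizes below are assumed for the example:

```python
# Rough illustration (assumed sizes): parameters of a standard k x k
# convolution vs the depthwise-separable factorisation used in MobileNets:
# standard = k*k*C_in*C_out; separable = k*k*C_in (depthwise) + C_in*C_out (pointwise).
k, c_in, c_out = 3, 128, 256
standard = k * k * c_in * c_out
separable = k * k * c_in + c_in * c_out
print(f"{standard} vs {separable} ({standard / separable:.1f}x fewer parameters)")
```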