2021
DOI: 10.48550/arxiv.2111.01697
Preprint

Low-Rank+Sparse Tensor Compression for Neural Networks

Abstract: Low-rank tensor compression has been proposed as a promising approach to reduce the memory and compute requirements of neural networks for their deployment on edge devices. Tensor compression reduces the number of parameters required to represent a neural network weight by assuming network weights possess a coarse higher-order structure. This coarse structure assumption has been applied to compress large neural networks such as VGG and ResNet. However modern state-of-the-art neural networks for computer vision… Show more

Cited by 2 publications (4 citation statements)
References 15 publications

“…The proposed HMC results in a precision loss of only 0.11% at 2.07x. When the compression ratio reaches 5.56x (higher than the 5.36x and 5.50x reported in the literature [38,39]), HMC incurs only 0.94% accuracy loss, which is much lower than ATMC's 2.01% and the 1.44% in [39]. This shows that our method achieves a higher compression ratio and better compression results on ImageNet.…”
Section: PLOS ONE (mentioning)
confidence: 61%
“…In the existing work, some scholars have studied the compression of the VGG network [37] and the ResNet network [38] by combining low-rank decomposition and sparse representation, while others have studied low-rank + sparse weight compression of SOTA architectures that rely on efficient depthwise-separable convolutions [39]. These methods apply additive low-rank plus sparse compression to the weights of the neural network, as shown in Fig 1, and can obtain better compression results than sparse compression or low-rank decomposition alone.…”
Section: Introduction (mentioning)
confidence: 99%
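
To make the additive structure referenced in the quote above concrete, below is a hedged one-shot sketch of splitting a weight into a low-rank part plus a sparse residual. The cited works learn both parts jointly during training; the rank and sparsity values here are assumptions for illustration.

```python
# Minimal sketch of additive low-rank + sparse weight compression,
# W ~= A @ B + S with S sparse.  The cited works learn both parts during
# training; this is a one-shot post-hoc decomposition for illustration only.
import numpy as np

def lowrank_plus_sparse(W, rank, sparsity):
    """Greedy split: truncated SVD for the low-rank part, then keep the
    largest-magnitude residual entries as the sparse part."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]
    B = Vt[:rank, :]
    R = W - A @ B                               # residual after low-rank fit
    k = int(sparsity * W.size)                  # number of nonzeros to keep
    thresh = np.sort(np.abs(R), axis=None)[-k]
    S = np.where(np.abs(R) >= thresh, R, 0.0)
    return A, B, S

W = np.random.randn(256, 256)                   # stand-in weight matrix
A, B, S = lowrank_plus_sparse(W, rank=16, sparsity=0.05)
params = A.size + B.size + np.count_nonzero(S)  # factors + sparse nonzeros
print(params, W.size)                           # far fewer than 65,536
```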
“…Therefore, our method is crucial to remedy this drawback. Yu et al. (2017) and Hawkins et al. (2021) have applied low-rank plus sparse compression to CNNs. They mask out some kernels in a convolution layer as the sparse approximation and add two sequential convolutional layers that are parallel to the sparse convolutional layer as the low-rank approximation.…”
Section: Discussion (mentioning)
confidence: 99%
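
A hedged PyTorch sketch of the layer structure described in this quote: a convolution with some of its kernels masked out (the sparse branch) in parallel with two stacked convolutions that form a rank bottleneck (the low-rank branch), with the two outputs summed. The channel counts, rank, keep ratio, and the random mask are placeholders, not values or the training procedure from the cited works.

```python
# Sketch only: sparse branch = conv with whole kernels zeroed out; low-rank
# branch = 1x1 projection to `rank` channels followed by a kxk conv; the
# branch outputs are added.  The random mask stands in for a learned pattern.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LowRankPlusSparseConv(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size, rank, keep_ratio=0.1):
        super().__init__()
        pad = kernel_size // 2
        # Low-rank branch: two sequential convs acting as a rank bottleneck.
        self.lr_proj = nn.Conv2d(in_ch, rank, 1, bias=False)
        self.lr_conv = nn.Conv2d(rank, out_ch, kernel_size, padding=pad,
                                 bias=False)
        # Sparse branch: full conv whose (out, in) kernels are partly zeroed.
        self.sp_conv = nn.Conv2d(in_ch, out_ch, kernel_size, padding=pad,
                                 bias=False)
        mask = (torch.rand(out_ch, in_ch, 1, 1) < keep_ratio).float()
        self.register_buffer("mask", mask)   # placeholder random kernel mask
        self.pad = pad

    def forward(self, x):
        sparse_out = F.conv2d(x, self.sp_conv.weight * self.mask,
                              padding=self.pad)
        return self.lr_conv(self.lr_proj(x)) + sparse_out

layer = LowRankPlusSparseConv(in_ch=64, out_ch=64, kernel_size=3, rank=8)
y = layer(torch.randn(1, 64, 32, 32))
print(y.shape)  # torch.Size([1, 64, 32, 32])
```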
“…In that case, low-rank approximations are designed to store shared features across all coherent parts of neurons, and sparse approximations aim to learn distinct features from incoherent parts of neurons. In addition, previous work (Yu et al., 2017; Hawkins et al., 2021; Chen et al., 2021) applied a similar method to Convolutional Neural Networks (CNNs) and parameter-efficient fine-tuning, but we will discuss the limitations of their methods in Section 5.…”
Section: Introduction (mentioning)
confidence: 99%