2014
DOI: 10.48550/arxiv.1412.6553
Preprint

Speeding-up Convolutional Neural Networks Using Fine-tuned CP-Decomposition

Abstract: We propose a simple two-step approach for speeding up convolution layers within large convolutional neural networks based on tensor decomposition and discriminative fine-tuning. Given a layer, we use non-linear least squares to compute a low-rank CP-decomposition of the 4D convolution kernel tensor into a sum of a small number of rank-one tensors. At the second step, this decomposition is used to replace the original convolutional layer with a sequence of four convolutional layers with small kernels. After suc…
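As a concrete illustration of the replacement described in the abstract (a 1×1 channel-projecting convolution, a K×1 and a 1×K per-component convolution, and a 1×1 channel-restoring convolution built from the CP factors), here is a minimal sketch in PyTorch. It is not the authors' code: TensorLy's ALS-based parafac stands in for the non-linear least-squares solver used in the paper, the function name cp_decompose_conv and the rank choice are placeholders, and stride-1 convolutions are assumed.

```python
# Minimal sketch (not the authors' implementation): replace one Conv2d with
# the four-layer CP sequence described in the abstract. TensorLy's ALS-based
# parafac is used here in place of the paper's non-linear least-squares solver.
import torch
import torch.nn as nn
import tensorly as tl
from tensorly.decomposition import parafac

tl.set_backend('pytorch')

def cp_decompose_conv(layer: nn.Conv2d, rank: int) -> nn.Sequential:
    # Kernel tensor has shape (C_out, C_in, kH, kW); stride 1 is assumed.
    W = layer.weight.data
    c_out, c_in, kh, kw = W.shape

    # CP decomposition into `rank` rank-one terms; the factors have shapes
    # (C_out x R), (C_in x R), (kH x R), (kW x R).
    weights, (f_out, f_in, f_h, f_w) = parafac(W, rank=rank, init='random')
    f_out = f_out * weights  # absorb the CP scaling into the output factor

    # 1) 1x1 convolution: C_in -> R channels.
    conv_in = nn.Conv2d(c_in, rank, kernel_size=1, bias=False)
    conv_in.weight.data = f_in.t().reshape(rank, c_in, 1, 1)

    # 2) kH x 1 grouped convolution applied separately to each CP component.
    conv_h = nn.Conv2d(rank, rank, kernel_size=(kh, 1),
                       padding=(layer.padding[0], 0), groups=rank, bias=False)
    conv_h.weight.data = f_h.t().reshape(rank, 1, kh, 1)

    # 3) 1 x kW grouped convolution applied separately to each CP component.
    conv_w = nn.Conv2d(rank, rank, kernel_size=(1, kw),
                       padding=(0, layer.padding[1]), groups=rank, bias=False)
    conv_w.weight.data = f_w.t().reshape(rank, 1, 1, kw)

    # 4) 1x1 convolution: R -> C_out channels (reuses the original bias).
    conv_out = nn.Conv2d(rank, c_out, kernel_size=1,
                         bias=layer.bias is not None)
    conv_out.weight.data = f_out.reshape(c_out, rank, 1, 1)
    if layer.bias is not None:
        conv_out.bias.data = layer.bias.data

    return nn.Sequential(conv_in, conv_h, conv_w, conv_out)
```

In use, one would swap a convolutional layer of a trained network, e.g. model.features[3] = cp_decompose_conv(model.features[3], rank=16), and then carry out the abstract's second step: fine-tuning the whole network with standard backpropagation to recover accuracy.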

Cited by 129 publications (213 citation statements)
References 16 publications
“…We take a model and perform tensor decomposition on all convolutional kernels and FC layers except the last layer. This idea was first studied in Lebedev et al (2014) on larger, more compressible architectures. We report our results in Figure 7.…”
Section: Low-rank Only (mentioning)
Confidence: 99%
“…Therefore, in this paper we target memory cost reduction. In this area low-rank tensor compression is a popular approach (Garipov et al, 2016;Novikov et al, 2015;Lebedev et al, 2014) that can achieve orders of magnitude compression, but can lead to significant accuracy loss.…”
Section: Introduction (mentioning)
Confidence: 99%
“…Different tensor decompositions result in different TNNs, e.g., CP-Nets [24], Tucker-Nets [4], [21], HT-Nets [48], BT-Nets [46], TT-Nets [7], [30] and TR-Nets [44], corresponding to CANDECOMP/PARAFAC (CP) decomposition [13], Tucker decomposition [41], Hierarchical Tucker (HT) decomposition [10], Block-Term (BT) decomposition [5], Tensor Train (TT) [32] and Tensor Ring (TR) [50], respectively. Tjandra et al [39] show that the TT-format has a better performance compared with the Tucker-format with the same number of parameters required in the recurrent neural network (RNN).…”
Section: Introduction (mentioning)
Confidence: 99%
“…Many methods have been proposed for CNN compression. For example: weight quantization [2,4], tensor low-rank factorization [23,29], network pruning [14,13,61,19], and knowledge distillation [21,48]. Among them all, a combination of channel pruning and knowledge distillation is the preferable method to learn smaller dense models, which can easily leverage Basic Linear Algebra Subprograms (BLAS) libraries [31].…”
Section: Introduction (mentioning)
Confidence: 99%