2019
DOI: 10.1007/978-3-030-11018-5_25

Training Compact Deep Learning Models for Video Classification Using Circulant Matrices

Abstract: In real-world scenarios, model accuracy is hardly the only factor to consider. Large models consume more memory and are more computationally intensive, which makes them difficult to train and deploy, especially on mobile devices. In this paper, we build on recent results at the crossroads of linear algebra and deep learning which demonstrate how imposing a structure on large weight matrices can be used to reduce the size of the model. We propose very compact models for video classification based on state-of…

Cited by 8 publications (3 citation statements)
References 21 publications (46 reference statements)
“…An alternative to pruning is to enforce sparse or structured matrices a priori. Circulant matrix structures save weights by shifting and reusing the same row [1]. Alternatively, the frequency domain can be used to impose sparse diagonal patterns on the network weight matrices.…”
Section: B. Network Compression
confidence: 99%
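To illustrate the parameter saving behind this statement: an n × n circulant matrix is fully determined by a single length-n vector, and its matrix-vector product reduces to a circular convolution computable with FFTs. The sketch below is a minimal NumPy illustration of that idea, not the cited paper's implementation; the function name and the sanity check are ours.

import numpy as np

def circulant_matvec(c, x):
    """Compute C @ x where C is the circulant matrix with first column c.

    Only the length-n vector c is stored instead of the full n x n matrix,
    and the product costs O(n log n) because circulant matrices are
    diagonalized by the Fourier basis.
    """
    return np.real(np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)))

# Sanity check against the explicitly materialized matrix: column j of C is
# c circularly shifted by j, i.e. C[i, j] = c[(i - j) mod n].
n = 8
rng = np.random.default_rng(0)
c, x = rng.standard_normal(n), rng.standard_normal(n)
C = np.stack([np.roll(c, j) for j in range(n)], axis=1)
assert np.allclose(C @ x, circulant_matvec(c, x))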
“…Motivated by these applications, extensive studies have recently been conducted on designing compact architectures [44,27,28,64,2] or compressing models [13,57,9,35]. However, most existing methods process all the frames in a given video at the same resolution.…”
Section: Introduction
confidence: 99%
“…In particular, owing to its scale insensitivity, average pooling allows ResNet models pre-trained on one input size to be effectively evaluated on other input sizes with favourable results [12]. In addition, average pooling is known to be more robust than max pooling against outliers and noise [195]. It is also a conceptually simple method that has been empirically verified to consistently outperform max pooling across a range of CNN architectures [196].…”
Section: Downsampling Operation
confidence: 99%
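The scale-insensitivity point can be made concrete: because global average pooling collapses any H × W feature map to one value per channel, the classifier's input dimension does not depend on the input resolution, so the same weights run at several input sizes. The PyTorch sketch below uses an assumed toy architecture (not an actual ResNet and not from [12]) purely to demonstrate this property.

import torch
import torch.nn as nn

# Toy network (illustrative assumption): the AdaptiveAvgPool2d(1) layer maps
# (N, 64, H, W) -> (N, 64, 1, 1) for any H and W, so the final linear layer
# never sees the spatial size.
model = nn.Sequential(
    nn.Conv2d(3, 32, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),  # global average pooling over the feature map
    nn.Flatten(),
    nn.Linear(64, 10),
)

# The same weights are evaluated at two different input resolutions.
for size in (224, 320):
    logits = model(torch.randn(1, 3, size, size))
    print(size, tuple(logits.shape))  # -> (1, 10) in both cases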