Low-rank lottery tickets: finding efficient low-rank neural networks via matrix differential equations

Preprint, 2022
DOI: 10.48550/arxiv.2205.13571

Abstract: Neural networks have achieved tremendous success in a large variety of applications. However, their memory footprint and computational demand can render them impractical in application settings with limited hardware or energy resources. In this work, we propose a novel algorithm to find efficient low-rank subnetworks. Remarkably, these subnetworks are determined and adapted already during the training phase and the overall time and memory resources required by both training and evaluating them are significantly…
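The abstract stops short of describing the mechanism, and the paper's actual algorithm trains the factors of a dynamical low-rank approximation via matrix differential equations, which is beyond a short snippet. As a rough, hedged illustration of why a rank-r factorization shrinks the time and memory footprint of training and evaluation, the following PyTorch sketch replaces a dense linear layer with a fixed-rank one; the class name LowRankLinear, the initialization scales, and the chosen rank are all hypothetical and not taken from the paper.

```python
# Hypothetical sketch, not the paper's matrix-ODE-based training scheme: a linear
# layer whose weight is kept in factored form W = U @ V with a small fixed rank,
# so both the parameter count and the per-step cost scale with r*(d_in + d_out)
# instead of d_in * d_out.
import torch
import torch.nn as nn


class LowRankLinear(nn.Module):
    def __init__(self, d_in: int, d_out: int, rank: int, bias: bool = True):
        super().__init__()
        self.U = nn.Parameter(torch.randn(d_out, rank) / rank ** 0.5)
        self.V = nn.Parameter(torch.randn(rank, d_in) / d_in ** 0.5)
        if bias:
            self.bias = nn.Parameter(torch.zeros(d_out))
        else:
            self.register_parameter("bias", None)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Two skinny matmuls instead of one dense (d_out x d_in) product.
        y = x @ self.V.t() @ self.U.t()
        return y if self.bias is None else y + self.bias


if __name__ == "__main__":
    d_in, d_out, rank = 1024, 1024, 32
    layer = LowRankLinear(d_in, d_out, rank)
    dense = d_in * d_out
    low_rank = rank * (d_in + d_out) + d_out
    print(f"dense weight: {dense} params, rank-{rank} factors: {low_rank} params")
    print(layer(torch.randn(8, d_in)).shape)  # torch.Size([8, 1024])
```

For a 1024×1024 layer at rank 32, the factored form stores roughly 6% of the dense parameters, which is the kind of saving in both training and inference that the abstract alludes to.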

Cited by 1 publication (1 citation statement)
References 33 publications
“…Our analytical results provide intuition for our observation that the rank of learning dynamics is limited by task complexity. This supports previous findings of low (matrix) rank weight changes in RNNs [13] and in deep networks [45,46,47]. Our results on the numerical matrix rank also has interesting ties to work that uses rank compression for more efficient training in deep networks [45], and for numerical solutions to systems with time-varying dynamics [48].…”
Section: Discussion (supporting)
confidence: 91%
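For readers unfamiliar with the term, the "numerical matrix rank" mentioned in the statement above is typically measured by counting the singular values that exceed a tolerance relative to the largest one. The snippet below is a small NumPy illustration of that measurement on a synthetic low-rank weight update; it is not code from either cited work, and the tolerance, sizes, and variable names are arbitrary choices for the example.

```python
# Hypothetical illustration of measuring numerical rank via an SVD threshold.
import numpy as np


def numerical_rank(M: np.ndarray, rel_tol: float = 1e-6) -> int:
    """Count singular values above rel_tol * sigma_max."""
    s = np.linalg.svd(M, compute_uv=False)  # sorted in descending order
    return int(np.sum(s > rel_tol * s[0]))


rng = np.random.default_rng(0)
d, true_rank = 256, 4

# A weight change built from a few outer products, as a stand-in for the
# low-rank learning dynamics the citing work observes, plus tiny noise.
delta_W = sum(np.outer(rng.standard_normal(d), rng.standard_normal(d))
              for _ in range(true_rank))
delta_W += 1e-9 * rng.standard_normal((d, d))

print(numerical_rank(delta_W))  # prints 4: the noise floor stays below tolerance
```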