2021
DOI: 10.48550/arxiv.2107.03356
Preprint
M-FAC: Efficient Matrix-Free Approximations of Second-Order Information

Cited by 3 publications (5 citation statements)
References 10 publications
“…Various methods exist to select the candidate weights for removal, including magnitude pruning [20], which selects the weights with the lowest absolute values, and gradient-based methods, which use the gradient at each weight to identify those trending towards zero fastest. Among the gradient-based methods are first-order techniques based on first-derivative information [31, 38], and second-order ones [9, 21, 23], which aim to find the set of weights whose removal causes the minimum increase in the network's loss. Second-order methods have proven effective for pruning convolutional networks in the past, and they have recently been optimized for Large Language Models (LLMs) [21].…”
Section: Network Pruning
confidence: 99%
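The magnitude-pruning criterion described in the statement above (zero out the weights with the smallest absolute values) can be sketched in a few lines. This is an illustrative sketch only, not code from any of the cited papers; the function name and the global unstructured-sparsity formulation are assumptions for the example.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the fraction `sparsity` of weights with the smallest |w|."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)  # number of weights to remove
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

# Example: pruning half of a tiny weight vector removes the two
# entries with the smallest magnitudes (0.1 and -0.05).
w = np.array([0.1, -0.5, 2.0, -0.05])
pruned = magnitude_prune(w, 0.5)  # -> [0.0, -0.5, 2.0, 0.0]
```

Note that this criterion looks only at weight magnitudes; the gradient-based and second-order methods cited above replace the `|w|` score with saliency measures derived from derivative information.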
“…Second-order pruning methods, e.g. [14, 22, 35, 54, 56], augment this basic metric with second-order information, which can lead to higher accuracy of the resulting pruned models relative to GMP.…”
Section: Sparsification Techniques
confidence: 99%
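A common way such second-order methods score weights is the classic Optimal Brain Damage-style saliency, which estimates the loss increase from zeroing a weight under a local quadratic model with a diagonal Hessian approximation. The sketch below is a hedged illustration of that general idea, not the specific algorithm of M-FAC or any single cited paper; the function names and the diagonal-Hessian simplification are assumptions for the example.

```python
import numpy as np

def obd_saliency(weights: np.ndarray, hessian_diag: np.ndarray) -> np.ndarray:
    """Estimated loss increase from zeroing each weight w_i:
    0.5 * H_ii * w_i^2 (diagonal quadratic model)."""
    return 0.5 * hessian_diag * weights ** 2

def second_order_prune(weights: np.ndarray,
                       hessian_diag: np.ndarray,
                       sparsity: float) -> np.ndarray:
    """Remove the fraction `sparsity` of weights with the lowest saliency."""
    s = obd_saliency(weights, hessian_diag)
    k = int(sparsity * weights.size)
    idx = np.argsort(s)[:k]  # lowest saliency = cheapest to remove
    pruned = weights.copy()
    pruned[idx] = 0.0
    return pruned

# Example: a large weight in a flat (low-curvature) direction is cheaper
# to remove than a small weight in a sharp direction, so second-order
# pruning can disagree with pure magnitude pruning.
w = np.array([0.2, 1.0])
h = np.array([100.0, 0.01])  # curvature (diagonal Hessian) per weight
pruned = second_order_prune(w, h, 0.5)  # -> [0.2, 0.0]
```

The practical difficulty, which M-FAC and related work address, is that forming or inverting the full Hessian is infeasible at network scale, motivating matrix-free approximations of this second-order information.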
“…We measure the results of sparse transfer with full and linear finetuning on the same downstream tasks, starting from dense ImageNet models pruned using regularization-based and post-training pruning methods. Specifically, we use AC/DC, STR, and M-FAC [14], respectively.…”
Section: F Experiments on MobileNetV1
confidence: 99%
“…It is a popular technique to reduce the growing energy and performance costs of neural networks and to make it feasible to deploy them in resource-constrained environments such as smart devices. Various approaches to pruning have been developed as it has gained considerable attention over the past few years (Zhu & Gupta, 2017; Sui et al., 2021; Liebenwein et al., 2021; Peste et al., 2021; Frantar et al., 2021; Deng et al., 2020).…”
Section: Introduction
confidence: 99%