2023
DOI: 10.1007/s40747-023-01036-0

Progressive multi-level distillation learning for pruning network

Abstract: Although classification methods based on deep neural networks have achieved excellent results in classification tasks, they are difficult to apply in real-time scenarios because of high memory footprints and prohibitive inference times. Compared to unstructured pruning, structured pruning techniques reduce the runtime computation cost of the model more effectively, but inevitably reduce its precision. Traditional methods use fine-tuning to restore the damaged performance of the model. However, there is…
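To make the structured-pruning step referenced in the abstract concrete, below is a minimal sketch of channel-level pruning for a single convolutional layer, assuming a simple L1-norm filter-importance criterion; the paper's own criterion and pruning ratios are not given in this excerpt, so the function and its parameters are illustrative only.

```python
# Minimal sketch of structured (channel-level) pruning for one Conv2d layer,
# assuming an L1-norm filter-importance criterion (an assumption, not
# necessarily the paper's criterion).
import torch
import torch.nn as nn


def prune_conv_channels(conv: nn.Conv2d, keep_ratio: float = 0.5) -> nn.Conv2d:
    """Keep the output channels (filters) with the largest L1 weight norm."""
    n_keep = max(1, int(conv.out_channels * keep_ratio))
    importance = conv.weight.detach().abs().sum(dim=(1, 2, 3))  # one score per filter
    keep_idx = torch.argsort(importance, descending=True)[:n_keep]

    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    pruned.weight.data = conv.weight.data[keep_idx].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep_idx].clone()
    # Note: the layer consuming this output would also need its input
    # channels trimmed; that bookkeeping is omitted here.
    return pruned


conv = nn.Conv2d(16, 64, kernel_size=3, padding=1)
smaller = prune_conv_channels(conv, keep_ratio=0.25)
print(smaller.weight.shape)  # torch.Size([16, 16, 3, 3])
```

Removing whole filters like this shrinks the dense weight tensors themselves, which is why structured pruning cuts runtime cost directly, at the price of the accuracy drop the abstract mentions.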

Cited by 6 publications (3 citation statements); references 38 publications (51 reference statements).
“…Our approach provides a clear advantage by enabling a greater compression of the teacher model compared to the previous methods that relied on pruning [10][11][12][13][14][15], knowledge distillation (KD) [1,3,5,7,8], and the combination of both [16][17][18][19][20]. The crucial aspect of our study is pruning followed by KD.…”
Section: Discussion
confidence: 99%
“…Finally, Wang et al. [20] introduce an innovative approach that combines structured pruning with multi-level distillation. By using pre- and post-pruning networks as teacher-student pairs, they reduce the loss of accuracy through distillation and highlight the synergy between the two techniques.…”
Section: Combination of Pruning and Knowledge Distillation
confidence: 99%
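As a rough illustration of the pairing described in that statement, the sketch below treats the pre-pruning network's outputs as teacher signals and the pruned network as the student, combining intermediate-feature matching with logit matching. The 1x1 adapter layers, loss weights, and temperature are assumptions for illustration, not the paper's exact multi-level configuration.

```python
# Sketch of a multi-level distillation loss between a pre-pruning teacher and
# a post-pruning student: intermediate feature maps and final logits are both
# matched. Adapter layers and weights are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


def multilevel_distill_loss(t_feats, s_feats, t_logits, s_logits,
                            adapters, beta=0.5, T=4.0):
    """t_feats/s_feats: matching lists of intermediate feature maps."""
    # Feature-level terms: 1x1 adapters map the pruned student's channel
    # count back to the teacher's before an MSE comparison.
    feat_loss = sum(F.mse_loss(adapt(s), t)
                    for adapt, s, t in zip(adapters, s_feats, t_feats))

    # Logit-level term: temperature-softened KL divergence.
    kd = F.kl_div(F.log_softmax(s_logits / T, dim=1),
                  F.softmax(t_logits / T, dim=1),
                  reduction="batchmean") * (T * T)
    return beta * feat_loss + (1.0 - beta) * kd


# Toy usage: the teacher keeps 64 channels at this level, the pruned student 32.
t_feat, s_feat = torch.randn(8, 64, 16, 16), torch.randn(8, 32, 16, 16)
adapters = [nn.Conv2d(32, 64, kernel_size=1)]
t_logits, s_logits = torch.randn(8, 10), torch.randn(8, 10)
print(multilevel_distill_loss([t_feat], [s_feat], t_logits, s_logits, adapters).item())
```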
“…Knowledge distillation [11][12][13] has been proven to be an effective model compression method, capable of addressing the aforementioned challenges. It achieves this by transferring the dark knowledge of a powerful yet complex teacher model to a lightweight student model, thereby improving the performance of the student model without incurring additional costs, thus achieving the goal of model compression.…”
Section: Introduction
confidence: 99%
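For reference, the "dark knowledge" transfer described in that statement is usually realized as the standard temperature-softened distillation objective; a minimal sketch follows, with the temperature and weighting values chosen for illustration rather than taken from the cited work.

```python
# Minimal sketch of the standard knowledge-distillation objective: the student
# matches the teacher's temperature-softened class distribution alongside the
# usual hard-label cross-entropy. T and alpha are illustrative values.
import torch
import torch.nn.functional as F


def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    # Soft target: match the teacher's temperature-softened class distribution.
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)
    # Hard target: cross-entropy with the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard


student_logits = torch.randn(4, 10)
teacher_logits = torch.randn(4, 10)
labels = torch.randint(0, 10, (4,))
print(kd_loss(student_logits, teacher_logits, labels).item())
```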