2023
DOI: 10.1007/s40747-023-01036-0

Progressive multi-level distillation learning for pruning network

Abstract: Although classification methods based on deep neural networks have achieved excellent results in classification tasks, they are difficult to apply in real-time scenarios because of high memory footprints and prohibitive inference times. Compared to unstructured pruning, structured pruning techniques reduce the runtime computation cost of the model more effectively, but inevitably reduce its precision. Traditional methods use fine-tuning to restore the damaged performance of the model. However, there is…
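To make the structured-pruning step referenced in the abstract concrete, below is a minimal sketch of channel-level pruning for a single convolutional layer, assuming a simple L1-norm filter-importance criterion; the paper's own criterion and pruning ratios are not given in this excerpt, so the function and its parameters are illustrative only.

```python
# Minimal sketch of structured (channel-level) pruning for one Conv2d layer,
# assuming an L1-norm filter-importance criterion (an assumption, not
# necessarily the paper's criterion).
import torch
import torch.nn as nn


def prune_conv_channels(conv: nn.Conv2d, keep_ratio: float = 0.5) -> nn.Conv2d:
    """Keep the output channels (filters) with the largest L1 weight norm."""
    n_keep = max(1, int(conv.out_channels * keep_ratio))
    importance = conv.weight.detach().abs().sum(dim=(1, 2, 3))  # one score per filter
    keep_idx = torch.argsort(importance, descending=True)[:n_keep]

    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    pruned.weight.data = conv.weight.data[keep_idx].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep_idx].clone()
    # Note: the layer consuming this output would also need its input
    # channels trimmed; that bookkeeping is omitted here.
    return pruned


conv = nn.Conv2d(16, 64, kernel_size=3, padding=1)
smaller = prune_conv_channels(conv, keep_ratio=0.25)
print(smaller.weight.shape)  # torch.Size([16, 16, 3, 3])
```

Removing whole filters like this shrinks the dense weight tensors themselves, which is why structured pruning cuts runtime cost directly, at the price of the accuracy drop the abstract mentions.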

Cited by 6 publications (3 citation statements); references 38 publications (51 reference statements).
“…Our approach provides a clear advantage by enabling a greater compression of the teacher model compared to the previous methods that relied on pruning [10][11][12][13][14][15], knowledge distillation (KD) [1,3,5,7,8], and the combination of both [16][17][18][19][20]. The crucial aspect of our study is pruning followed by KD.…”
Section: Discussion
confidence: 99%
“…Finally, Wang et al. [20] introduce an innovative approach that combines structured pruning with multi-level distillation. By using pre- and post-pruning networks as teacher-student pairs, they reduce the loss of accuracy through distillation and highlight the synergy between the two techniques.…”
Section: Combination of Pruning and Knowledge Distillation
confidence: 99%
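As a rough illustration of the pairing described in that statement, the sketch below treats the pre-pruning network's outputs as teacher signals and the pruned network as the student, combining intermediate-feature matching with logit matching. The 1x1 adapter layers, loss weights, and temperature are assumptions for illustration, not the paper's exact multi-level configuration.

```python
# Sketch of a multi-level distillation loss between a pre-pruning teacher and
# a post-pruning student: intermediate feature maps and final logits are both
# matched. Adapter layers and weights are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


def multilevel_distill_loss(t_feats, s_feats, t_logits, s_logits,
                            adapters, beta=0.5, T=4.0):
    """t_feats/s_feats: matching lists of intermediate feature maps."""
    # Feature-level terms: 1x1 adapters map the pruned student's channel
    # count back to the teacher's before an MSE comparison.
    feat_loss = sum(F.mse_loss(adapt(s), t)
                    for adapt, s, t in zip(adapters, s_feats, t_feats))

    # Logit-level term: temperature-softened KL divergence.
    kd = F.kl_div(F.log_softmax(s_logits / T, dim=1),
                  F.softmax(t_logits / T, dim=1),
                  reduction="batchmean") * (T * T)
    return beta * feat_loss + (1.0 - beta) * kd


# Toy usage: the teacher keeps 64 channels at this level, the pruned student 32.
t_feat, s_feat = torch.randn(8, 64, 16, 16), torch.randn(8, 32, 16, 16)
adapters = [nn.Conv2d(32, 64, kernel_size=1)]
t_logits, s_logits = torch.randn(8, 10), torch.randn(8, 10)
print(multilevel_distill_loss([t_feat], [s_feat], t_logits, s_logits, adapters).item())
```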
“…Knowledge distillation [11][12][13] has been proven to be an effective model compression method, capable of addressing the aforementioned challenges. It achieves this by transferring the dark knowledge of a powerful yet complex teacher model to a lightweight student model, thereby improving the performance of the student model without incurring additional costs, thus achieving the goal of model compression.…”
Section: Introduction
confidence: 99%
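For reference, the "dark knowledge" transfer described in that statement is usually realized as the standard temperature-softened distillation objective; a minimal sketch follows, with the temperature and weighting values chosen for illustration rather than taken from the cited work.

```python
# Minimal sketch of the standard knowledge-distillation objective: the student
# matches the teacher's temperature-softened class distribution alongside the
# usual hard-label cross-entropy. T and alpha are illustrative values.
import torch
import torch.nn.functional as F


def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    # Soft target: match the teacher's temperature-softened class distribution.
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * (T * T)
    # Hard target: cross-entropy with the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard


student_logits = torch.randn(4, 10)
teacher_logits = torch.randn(4, 10)
labels = torch.randint(0, 10, (4,))
print(kd_loss(student_logits, teacher_logits, labels).item())
```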