2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021
DOI: 10.1109/iccv48922.2021.00501
Student Customized Knowledge Distillation: Bridging the Gap Between Student and Teacher

Cited by 47 publications (14 citation statements)
References: 17 publications
“…However, this could result in a large model size and inefficient inference. In the ML community, researchers have already studied different techniques (e.g., weight quantization [58], [59], [101] and knowledge distillation [37], [67], [98], [185]) to build a lightweight model from a heavyweight one. For instance, Han et al. [59] pruned the network, quantized the parameters, and compressed them using Huffman coding.…”
Section: Research Opportunities
confidence: 99%
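The Han et al. pipeline cited above combines pruning, weight quantization, and Huffman coding. As a rough, hedged sketch of the first two stages (not the cited implementation; the sparsity level, bit-width, and function names are illustrative assumptions), a minimal NumPy version might look like this:

```python
import numpy as np

def magnitude_prune(weights, sparsity=0.9):
    """Zero out the smallest-magnitude weights (illustrative global threshold)."""
    threshold = np.quantile(np.abs(weights), sparsity)
    return np.where(np.abs(weights) < threshold, 0.0, weights)

def uniform_quantize(weights, num_bits=5):
    """Map weights onto 2**num_bits levels; returns integer codes plus scale/offset."""
    levels = 2 ** num_bits
    w_min, w_max = float(weights.min()), float(weights.max())
    scale = (w_max - w_min) / (levels - 1)
    codes = np.round((weights - w_min) / scale).astype(np.int32)
    return codes, scale, w_min

# Prune, then quantize; the integer codes (plus scale/offset) are what a
# Huffman coder would subsequently compress losslessly.
w = np.random.randn(256, 256).astype(np.float32)
codes, scale, offset = uniform_quantize(magnitude_prune(w))
w_hat = codes * scale + offset  # dequantized approximation of the pruned weights
```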
“…Researchers have to trade off performance against the cost of deployment, especially in settings where computing resources are strictly limited. Much work has explored lightweight DNNs, such as network pruning [31], [32], knowledge distillation [33], [34], and quantization [35], [36]. DNNs that run with low-precision operations during inference provide power and memory advantages over full precision, and they also benefit low-bit-width artificial-intelligence chip design [37], [38].…”
Section: Total Direct Effect
confidence: 99%
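The low-precision-inference point above can be illustrated with a simulated ("fake") int8 quantization round trip; the symmetric per-tensor scheme and 8-bit width below are assumptions for illustration, not the procedure of the works cited in the excerpt:

```python
import numpy as np

def fake_quant_int8(x):
    """Simulate symmetric per-tensor int8 quantization via a quantize-dequantize round trip."""
    scale = np.max(np.abs(x)) / 127.0            # symmetric range [-127, 127]
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q.astype(np.float32) * scale          # dequantize to expose the rounding error

x = np.random.randn(4, 8).astype(np.float32)
x_q = fake_quant_int8(x)
print("max abs quantization error:", np.max(np.abs(x - x_q)))
```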
“…They demonstrated that the "dark knowledge" lies in the output distributions of a large-capacity teacher network and benefits the student's representation learning. Recent works have mainly explored how to better transfer this "dark knowledge" and improve efficiency from various aspects, such as reducing the difference between teacher and student [3,5,18,34], designing student-friendly architectures [16,20], improving distillation efficiency [7,14,27,29], and explaining distillation's working mechanism [1,23].…”
Section: Knowledge Distillation
confidence: 99%
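Since this excerpt hinges on the "dark knowledge" in the teacher's output distribution, a minimal sketch of the vanilla distillation loss (temperature-softened KL plus hard-label cross-entropy, in the spirit of Hinton et al.) is given below; the temperature and weighting are illustrative assumptions, and this is not the student-customized scheme proposed in the paper itself:

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    """Vanilla distillation loss: temperature-softened KL plus hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)                                  # rescale gradients as in Hinton et al.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Toy usage with random logits for a 10-class problem.
student = torch.randn(8, 10)
teacher = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
loss = kd_loss(student, teacher, labels)
```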