2023
DOI: 10.1609/aaai.v37i2.25236
Curriculum Temperature for Knowledge Distillation

Abstract: Most existing distillation methods ignore the flexible role of the temperature in the loss function and fix it as a hyper-parameter that can be decided by an inefficient grid search. In general, the temperature controls the discrepancy between two distributions and can faithfully determine the difficulty level of the distillation task. Keeping a constant temperature, i.e., a fixed level of task difficulty, is usually sub-optimal for a growing student during its progressive learning stages. In this paper, we pr…
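As context for the abstract, the temperature enters the standard softened-softmax distillation loss as sketched below. This is a minimal illustrative PyTorch snippet, not code from the paper; the function name and default value are assumptions.

import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, temperature=4.0):
    # A higher temperature softens both distributions, shrinking the
    # discrepancy the student must match (an easier task); a lower
    # temperature sharpens them (a harder task).
    log_p_student = F.log_softmax(student_logits / temperature, dim=1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=1)
    # The temperature**2 factor keeps gradient magnitudes comparable
    # across temperature settings.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2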

Cited by 30 publications (3 citation statements)
References 34 publications (50 reference statements)
“…Refer to Algorithm 2 for details. In [38], [54], it was experimentally proven that the value of α has little influence on the final classification performance. Therefore, they fixed α to 1 in different datasets, while varying the value of β.…”
Section: E. Algorithm Implementation Process
confidence: 99%
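The α/β weighting mentioned in this statement presumably refers to the usual combination of a supervised cross-entropy term and a temperature-scaled distillation term. The sketch below assumes that standard form; the symbols mirror the quoted notation rather than the exact definitions in [38] or [54].

import torch.nn.functional as F

def total_loss(student_logits, teacher_logits, targets,
               alpha=1.0, beta=1.0, temperature=4.0):
    # alpha is reportedly fixed to 1 (little effect on accuracy); only beta is tuned.
    ce = F.cross_entropy(student_logits, targets)
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * temperature ** 2
    return alpha * ce + beta * kd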
“…WSLD [32] introduced a weighted soft-label approach and assigned a dynamic weight to the distillation loss based on the student's and teacher's learning on the supervised task. CTKD [33] proposed a dynamic temperature hyperparameter distillation framework. This framework increases distillation loss by adjusting the temperature adversarially, allowing the student to conduct knowledge transfer from easy to complex.…”
Section: Active Knowledge Distillation
confidence: 99%
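One common way to realize the adversarial temperature adjustment described here is a learnable temperature fed through a gradient reversal layer, so that the same backward pass that lowers the distillation loss for the student raises it with respect to the temperature. The sketch below illustrates that idea under these assumptions; it is not the authors' released implementation, and the module and function names are hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    # Identity in the forward pass; flips and scales the gradient backward.
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class AdversarialTemperature(nn.Module):
    # A single learnable temperature updated *against* the distillation loss:
    # minimizing the loss w.r.t. the student maximizes it w.r.t. the temperature,
    # so task difficulty grows as training proceeds (lambd can follow a curriculum).
    def __init__(self, init_temp=1.0):
        super().__init__()
        self.temp = nn.Parameter(torch.tensor(init_temp))

    def forward(self, lambd=1.0):
        # In practice the temperature would also be constrained to stay positive.
        return GradReverse.apply(self.temp, lambd)

def adversarial_kd_loss(student_logits, teacher_logits, temp_module, lambd=1.0):
    t = temp_module(lambd)
    return F.kl_div(
        F.log_softmax(student_logits / t, dim=1),
        F.softmax(teacher_logits / t, dim=1),
        reduction="batchmean",
    ) * t ** 2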
“…These weighted base classifiers are then integrated to generate a robust classifier. Utilizing AdaBoost regulations during deep neural network training has been shown to improve the representation power of network models [25][26][27][28][29][30][31][32][33]. For instance, Taherkhani et al [27] proposed AdaBoost-CNN by combining AdaBoost with a convolutional neural network (CNN), successfully addressing the multi-class imbalanced sample classification issue.…”
Section: Introduction
confidence: 99%
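As a generic illustration of the weighted-ensemble mechanism this last statement refers to, the scikit-learn snippet below fits a plain AdaBoost classifier and exposes the per-classifier weights; it is not the AdaBoost-CNN variant of [27].

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier

# Toy multi-class problem: each boosted base classifier (a depth-1 tree by
# default) receives a weight, and the weighted votes form the final classifier.
X, y = make_classification(n_samples=500, n_classes=3, n_informative=6, random_state=0)
clf = AdaBoostClassifier(n_estimators=50).fit(X, y)
print(clf.estimator_weights_[:5])  # weights assigned to the first five base classifiers
print(clf.score(X, y))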