Baosheng Yu scite author profile

In recent years, deep neural networks have been successful in both industry and academia, especially for computer vision tasks. The great success of deep learning is mainly due to its scalability to encode large-scale data and to maneuver billions of model parameters. However, it is a challenge to deploy these cumbersome deep models on devices with limited resources, e.g., mobile phones and embedded devices, not only because of the high computational complexity but also because of the large storage requirements. To this end, a variety of model compression and acceleration techniques have been developed. As a representative type of model compression and acceleration, knowledge distillation effectively learns a small student model from a large teacher model. It has received rapidly increasing attention from the community. This paper provides a comprehensive survey of knowledge distillation from the perspectives of knowledge categories, training schemes, teacher-student architecture, distillation algorithms, performance comparison, and applications. Furthermore, challenges in knowledge distillation are briefly reviewed, and comments on future research are discussed and forwarded.

Correcting the Triplet Selection Bias for Triplet Loss

Liu

Gong

et al. 2018

Erratum: High-spin states and level structure in ⁸⁴Rb [Phys. Rev. C 82, 014306 (2010)]

Shen

Han

Wen

et al. 2015

Phys. Rev. C

keV and the (10⁻) 3055-keV levels, between the (11⁻) 3240-keV and the 11(+) 3119-keV levels, and between the 11(+) 3119-keV and the (10⁻) 3055-keV levels are not given. The energies are 285, 120, and 65 keV, respectively, as shown in the corrected level scheme below. Arrow symbols are added to the lines denoting the 258-keV γ ray between the (14⁻) 4698-keV and the (14⁻) 4440-keV levels, the 193-keV γ ray between the (14⁻) 4440-keV and the (13⁻) 4247-keV levels, and the 345-keV γ ray between the (13⁻) 4247-keV and the (13⁻) 3902-keV levels. The corrections do not affect the results and conclusions of the original paper.

Band structures in ¹⁰⁶Pd

Zhu

et al. 2012

Phys. Rev. C

Deep Metric Learning With Tuplet Margin Loss

Tao

2019

Learning Affinity from Attention: End-to-End Weakly-Supervised Semantic Segmentation with Transformers

Zhan

et al. 2022

SynFace: Face Recognition with Synthetic Data

Qiu

Gong

et al. 2021

Unsupervised Domain Adaptation on Reading Comprehension

Cao

Fang

et al. 2020

AAAI

Reading comprehension (RC) has been studied on a variety of datasets, with performance boosted by deep neural networks. However, the generalization capability of these models across different domains remains unclear. To alleviate the problem, we investigate unsupervised domain adaptation on RC, wherein a model is trained on the labeled source domain and then applied to the target domain, for which only unlabeled samples are available. We first show that even with the powerful BERT contextual representation, a model cannot generalize well from one domain to another. To solve this, we provide a novel conditional adversarial self-training method (CASe). Specifically, our approach leverages a BERT model fine-tuned on the source dataset, along with confidence filtering, to generate reliable pseudo-labeled samples in the target domain for self-training. In addition, it further reduces domain distribution discrepancy through conditional adversarial learning across domains. Extensive experiments show our approach achieves performance comparable to supervised models on multiple large-scale benchmark datasets.

Knowledge Distillation: A Survey