2021
DOI: 10.1109/tnnls.2020.2970494

Learning Student Networks via Feature Embedding

Abstract: Deep convolutional neural networks have been widely used in numerous applications, but their demanding storage and computational resource requirements prevent their applications on mobile devices. Knowledge distillation aims to optimize a portable student network by taking the knowledge from a well-trained heavy teacher network. Traditional teacher-student based methods used to rely on additional fully-connected layers to bridge intermediate layers of teacher and student networks, which brings in a large number…
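The abstract contrasts the proposed feature-embedding approach with earlier distillation methods that bridge intermediate layers through extra learned layers. As a point of reference only, the following is a minimal sketch of that generic bridged feature-distillation setup, assuming PyTorch; the bridge module and dimensions are illustrative and are not taken from the paper.

```python
# Minimal sketch (not the paper's method): generic intermediate-feature
# distillation, where an extra "bridge" layer aligns the student's feature
# map with the teacher's before an L2 (MSE) loss is applied. The abstract
# notes that such bridge layers add many extra parameters, which the
# proposed feature-embedding approach aims to avoid.
import torch
import torch.nn as nn

class FeatureDistillationLoss(nn.Module):
    def __init__(self, student_channels: int, teacher_channels: int):
        super().__init__()
        # 1x1 conv bridge mapping student features to the teacher's width;
        # these are the additional parameters the paper criticises.
        self.bridge = nn.Conv2d(student_channels, teacher_channels, kernel_size=1)
        self.mse = nn.MSELoss()

    def forward(self, f_student: torch.Tensor, f_teacher: torch.Tensor) -> torch.Tensor:
        # Teacher features are detached: only the student (and bridge) are trained.
        return self.mse(self.bridge(f_student), f_teacher.detach())

# Usage with dummy feature maps (batch of 8; student 64 channels, teacher 256 channels, 7x7):
loss_fn = FeatureDistillationLoss(64, 256)
loss = loss_fn(torch.randn(8, 64, 7, 7), torch.randn(8, 256, 7, 7))
```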

Cited by 75 publications (42 citation statements)
References 41 publications

Citation statements:
“…In adding each relation model, the convolutional layers consider the impacts between each shallow part and add an L2 loss on its extracted features. According to knowledge distillation [84-87], all parts with a corresponding relation model can be regarded as student models, and the deepest part can be regarded as the teacher model.…”
Section: Proposed Methods
Mentioning confidence: 99%
“…According to knowledge distillation [84-87], all parts with a corresponding relation model can be regarded as student models, and the deepest part can be regarded as the teacher model. Relation models (the proposed self-knowledge distillation has multiple relation models within a whole network) in the neural network are denoted as …”
Mentioning confidence: 99%
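The excerpts above describe a self-distillation scheme in which shallow parts of one network act as students and the deepest part acts as the teacher, coupled through an L2 loss on extracted features. The sketch below only illustrates that idea, assuming PyTorch; the relation models are stood in by hypothetical 1x1-conv adapters, since the citing paper's exact formulation is not reproduced here.

```python
# Hedged sketch of the self-distillation idea: shallow parts are students,
# the deepest part is the teacher, matched with an L2 (MSE) loss.
# Each relation model is assumed to map its shallow feature to the shape
# of the deepest feature map (a hypothetical stand-in).
import torch
import torch.nn as nn

def self_distillation_l2(shallow_feats, deepest_feat, relation_models):
    """Sum of L2 losses between each adapted shallow feature and the deepest feature."""
    loss = 0.0
    for feat, relation in zip(shallow_feats, relation_models):
        # Adapt the shallow feature, then match the (detached) teacher feature.
        loss = loss + nn.functional.mse_loss(relation(feat), deepest_feat.detach())
    return loss

# Example with dummy features: two shallow parts adapted to a 256-channel
# deepest map by hypothetical 1x1-conv relation models.
relations = [nn.Conv2d(64, 256, 1), nn.Conv2d(128, 256, 1)]
feats = [torch.randn(4, 64, 7, 7), torch.randn(4, 128, 7, 7)]
deepest = torch.randn(4, 256, 7, 7)
loss = self_distillation_l2(feats, deepest, relations)
```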
“…The time-dependent correlation is established as a dynamic model based on the assumption that a student's learning ability in the current course is influenced only by that in the previous course, which is similar to a Markov chain [22]. Specifically, we define a function f(·) to capture the temporal variations of learning ability, which is shown in (4).…”
Section: B. Modeling the Learning Ability
Mentioning confidence: 99%
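The excerpt assumes a first-order (Markov-style) dependence: learning ability in the current course depends only on the previous course. Equation (4) of the citing paper is not reproduced above, so the transition function below is a purely hypothetical placeholder that only illustrates the recurrence structure.

```python
# Hypothetical illustration of the first-order assumption: ability at step t
# is computed from ability at step t-1 only. The linear form here is not the
# citing paper's equation (4); it merely shows the Markov-style recurrence.
def next_ability(prev_ability: float, course_difficulty: float,
                 retention: float = 0.9, gain: float = 0.1) -> float:
    """Hypothetical f(.): carry over part of the previous ability plus a gain term."""
    return retention * prev_ability + gain * course_difficulty

abilities = [0.5]                      # initial learning ability
for difficulty in [0.3, 0.6, 0.8]:     # a short course sequence
    abilities.append(next_ability(abilities[-1], difficulty))
```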
“…Recent years have witnessed the success of massive open online courses (MOOCs) and intelligent tutoring systems (ITS), which has accelerated the development of educational data mining (EDM). EDM seeks to develop methods to detect hidden communities [1], identify implicit relationships [2], explore the key factors influencing students' engagement [3], and analyze student learning behaviors and social activities [4], [5]. For instance, KDD CUP 2015 issued a challenge of predicting students' dropout rates from their personal behaviors.…”
Section: Introduction
Mentioning confidence: 99%
“…Our new method combines the advantages of several model compression methods. Compared to the latest knowledge distillation methods [2, 33], our method focuses on generating a student model from the original model by pruning. Therefore, we obtain a student network that suits the teacher network better than the manually selected networks used in simple knowledge distillation methods.…”
Section: Introduction
Mentioning confidence: 99%
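The excerpt describes deriving the student from the original network by pruning and then training it with knowledge distillation. The sketch below illustrates that combination under stated assumptions (magnitude pruning via torch.nn.utils.prune and a standard softened-logit KL loss); it is not the citing paper's exact procedure.

```python
# Hedged sketch: build a student by pruning a copy of the teacher, then train
# it with a standard knowledge-distillation loss on softened logits.
import copy
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune
import torch.nn.functional as F

def build_student_by_pruning(teacher: nn.Module, amount: float = 0.5) -> nn.Module:
    """Copy the teacher and zero out the smallest-magnitude conv weights."""
    student = copy.deepcopy(teacher)
    for module in student.modules():
        if isinstance(module, nn.Conv2d):
            # l1_unstructured masks weights to zero; it does not shrink the layer.
            prune.l1_unstructured(module, name="weight", amount=amount)
    return student

def distillation_loss(student_logits, teacher_logits, temperature: float = 4.0):
    """KL divergence between softened teacher and student predictions."""
    t = temperature
    return F.kl_div(
        F.log_softmax(student_logits / t, dim=1),
        F.softmax(teacher_logits / t, dim=1),
        reduction="batchmean",
    ) * (t * t)
```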