Knowledge Distillation for Semi-supervised Domain Adaptation

Orbes-Arteaga, Mauricio; Cardoso, M. Jorge; Sørensen, Lauge; Igel, Christian; Ourselin, Sébastien; Modat, Marc; Nielsen, Mads Eggert; Pai, Akshay

doi:10.1007/978-3-030-32695-1_8

Cited by 25 publications

(9 citation statements)

References 10 publications


“…Among the surveyed 31 symmetric approaches, direct approaches operated on the feature representations across domains by minimizing their differences (via mutual information [63], maximum mean discrepancy [46, 49, 64], Euclidean distance [65, 66, 67, 68, 69, 70, 71], Wasserstein distance [72], and average likelihood [73]), maximizing their correlation [74, 75] or covariance [36], and introducing sparsity with L1/L2 norms [42, 76]. On the other hand, indirect approaches were applied via adversarial training [28, 41, 54, 77, 78, 79, 80, 81, 82, 83, 84, 85], and knowledge distillation [86].…”

Section: Results (mentioning)

confidence: 99%

“…Additionally, various indirect symmetric feature-based approaches jointly optimized an adversarial loss and a task-specific loss on the source domain images [79, 80, 83]. Finally, Orbes-Arteaga et al. [86] used knowledge distillation, training a teacher model on the labeled source domain, and optimizing a student network on the probabilistic maps from the teacher model derived with the source and target domain images.…”

Section: Results (mentioning)

confidence: 99%
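The teacher-student scheme described in the statement above can be illustrated with a minimal sketch. It is not the cited authors' implementation: the function names, the `temperature` parameter, and the toy logits are assumptions made for illustration only. The sketch turns teacher outputs into soft labels and scores a student against them with a cross-entropy averaged over samples, which could come from either the source or the target domain.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities; a higher temperature gives softer labels."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_cross_entropy(teacher_probs, student_probs, eps=1e-12):
    """Cross-entropy of the student distribution against the teacher's soft labels."""
    return -sum(t * math.log(s + eps) for t, s in zip(teacher_probs, student_probs))


def distillation_loss(teacher_logits_batch, student_logits_batch, temperature=1.0):
    """Average soft cross-entropy over a batch (hypothetical helper).

    No ground-truth labels are needed here, so the batch may mix labeled
    source samples and unlabeled target samples.
    """
    losses = []
    for t_logits, s_logits in zip(teacher_logits_batch, student_logits_batch):
        teacher_probs = softmax(t_logits, temperature)
        student_probs = softmax(s_logits, temperature)
        losses.append(soft_cross_entropy(teacher_probs, student_probs))
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Toy two-class logits for two samples; the values are made up.
    teacher = [[2.0, 0.0], [0.5, 1.5]]
    student = [[1.0, 0.2], [0.1, 0.9]]
    print(distillation_loss(teacher, student, temperature=2.0))
```

In a real segmentation setting the same loss would be computed per voxel on network outputs, typically with an automatic-differentiation framework rather than plain Python lists.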


Transfer Learning in Magnetic Resonance Brain Imaging: A Systematic Review

et al. 2021


(1) Background: Transfer learning refers to machine learning techniques that focus on acquiring knowledge from related tasks to improve generalization in the tasks of interest. In magnetic resonance imaging (MRI), transfer learning is important for developing strategies that address the variation in MR images from different imaging protocols or scanners. Additionally, transfer learning is beneficial for reutilizing machine learning models that were trained to solve different (but related) tasks to the task of interest. The aim of this review is to identify research directions, gaps in knowledge, applications, and widely used strategies among the transfer learning approaches applied in MR brain imaging; (2) Methods: We performed a systematic literature search for articles that applied transfer learning to MR brain imaging tasks. We screened 433 studies for their relevance, and we categorized and extracted relevant information, including task type, application, availability of labels, and machine learning methods. Furthermore, we closely examined brain MRI-specific transfer learning approaches and other methods that tackled issues relevant to medical imaging, including privacy, unseen target domains, and unlabeled data; (3) Results: We found 129 articles that applied transfer learning to MR brain imaging tasks. The most frequent applications were dementia-related classification tasks and brain tumor segmentation. The majority of articles utilized transfer learning techniques based on convolutional neural networks (CNNs). Only a few approaches utilized clearly brain MRI-specific methodology, and considered privacy issues, unseen target domains, or unlabeled data. We proposed a new categorization to group specific, widely-used approaches such as pretraining and fine-tuning CNNs; (4) Discussion: There is increasing interest in transfer learning for brain MRI. 
Well-known public datasets have clearly contributed to the popularity of Alzheimer’s diagnostics/prognostics and tumor segmentation as applications. Likewise, the availability of pretrained CNNs has promoted their utilization. Finally, the majority of the surveyed studies did not examine in detail the interpretation of their strategies after applying transfer learning, and did not compare their approach with other transfer learning approaches.


“…Buciluǎ et al. [11] have proposed compressing a large model into a simple model which reduces space requirements and increases inference speed at the cost of a small performance loss. This idea has been revisited in [12] under the name knowledge distillation (KD) and gathered significant amount of attention [13, 14, 15, 16, 17]. KD considers a teacher network and a student network.…”

Section: Related Work (mentioning)

confidence: 99%

Label-similarity Curriculum Learning

Doǧan, Deshmukh, Machura et al. 2019

Preprint

Self Cite


Curriculum learning can improve neural network training by guiding the optimization to desirable optima. We propose a novel curriculum learning approach for image classification that adapts the loss function by changing the label representation. The idea is to use a probability distribution over classes as target label, where the class probabilities reflect the similarity to the true class. Gradually, this label representation is shifted towards the standard one-hot-encoding. That is, in the beginning minor mistakes are corrected less than large mistakes, resembling a teaching process in which broad concepts are explained first before subtle differences are taught. The class similarity can be based on prior knowledge. For the special case of the labels being natural words, we propose a generic way to automatically compute the similarities. The natural words are embedded into Euclidean space using a standard word embedding. The probability of each class is then a function of the cosine similarity between the vector representations of the class and the true label. The proposed label-similarity curriculum learning (LCL) approach was empirically evaluated on several popular deep learning architectures for image classification task applied to three datasets, ImageNet, CIFAR100, and AWA2. In all scenarios, LCL was able to improve the classification accuracy on the test data compared to standard training.
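The label construction described in this abstract can be sketched as follows. The abstract only says that class probabilities are "a function of" cosine similarity and that the label is shifted gradually toward one-hot. The softmax over similarities, the `sharpness` parameter, the linear interpolation controlled by `progress`, and the toy embeddings are therefore all assumptions, not the paper's exact formulation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_label(embeddings, true_idx, sharpness=1.0):
    """Soft target: softmax over each class's cosine similarity to the true class.

    The softmax is an assumed choice of "function of the cosine similarity".
    """
    sims = [cosine(e, embeddings[true_idx]) for e in embeddings]
    exps = [math.exp(sharpness * s) for s in sims]
    total = sum(exps)
    return [e / total for e in exps]


def curriculum_label(embeddings, true_idx, progress, sharpness=1.0):
    """Blend the soft target with the one-hot target.

    progress = 0.0 gives the similarity-based label, progress = 1.0 gives
    the standard one-hot label (assumed linear schedule).
    """
    soft = similarity_label(embeddings, true_idx, sharpness)
    one_hot = [1.0 if i == true_idx else 0.0 for i in range(len(embeddings))]
    return [(1.0 - progress) * s + progress * h for s, h in zip(soft, one_hot)]


if __name__ == "__main__":
    # Hypothetical 2-D "word embeddings" for the classes cat, dog, truck.
    class_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
    for progress in (0.0, 0.5, 1.0):
        print(progress, curriculum_label(class_vectors, true_idx=0, progress=progress))
```

With these toy vectors, early targets for "cat" give "dog" more probability mass than "truck", so confusing cat with dog is penalized less than confusing cat with truck until the label reaches one-hot.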


“…Therefore, other unsupervised domain adaptation methods are active research topics in medical image analysis. A recent work of Orbes-Arteaga et al. (2019) proposed an unsupervised domain adaptation approach in a similar fashion to transfer learning with teacher-student learning strategy. The authors used knowledge-distillation technique where a supervised teacher model is used to train a student network by generating soft labels for the target domain.…”

Section: Introduction (mentioning)

confidence: 99%

Transductive Transfer Learning for Domain Adaptation in Brain Magnetic Resonance Image Segmentation

et al. 2021


Segmentation of brain images from Magnetic Resonance Images (MRI) is an indispensable step in clinical practice. Morphological changes of sub-cortical brain structures and quantification of brain lesions are considered biomarkers of neurological and neurodegenerative disorders and used for diagnosis, treatment planning, and monitoring disease progression. In recent years, deep learning methods showed an outstanding performance in medical image segmentation. However, these methods suffer from generalisability problem due to inter-centre and inter-scanner variabilities of the MRI images. The main objective of the study is to develop an automated deep learning segmentation approach that is accurate and robust to the variabilities in scanner and acquisition protocols. In this paper, we propose a transductive transfer learning approach for domain adaptation to reduce the domain-shift effect in brain MRI segmentation. The transductive scenario assumes that there are sets of images from two different domains: (1) source—images with manually annotated labels; and (2) target—images without expert annotations. Then, the network is jointly optimised integrating both source and target images into the transductive training process to segment the regions of interest and to minimise the domain-shift effect. We proposed to use a histogram loss in the feature level to carry out the latter optimisation problem. In order to demonstrate the benefit of the proposed approach, the method has been tested in two different brain MRI image segmentation problems using multi-centre and multi-scanner databases for: (1) sub-cortical brain structure segmentation; and (2) white matter hyperintensities segmentation. The experiments showed that the segmentation performance of a pre-trained model could be significantly improved by up to 10%. 
For the first segmentation problem it was possible to achieve a maximum improvement from 0.680 to 0.799 in average Dice Similarity Coefficient (DSC) metric and for the second problem the average DSC improved from 0.504 to 0.602. Moreover, the improvements after domain adaptation were on par or showed better performance compared to the commonly used traditional unsupervised segmentation methods (FIRST and LST), also achieving faster execution time. Taking this into account, this work presents one more step toward the practical implementation of deep learning algorithms into the clinical routine.
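For readers unfamiliar with the Dice Similarity Coefficient reported in this abstract, the standard definition for binary masks is DSC = 2|A ∩ B| / (|A| + |B|). The sketch below computes it on flattened masks. The function name, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions for illustration and are not taken from the cited paper.

```python
def dice_coefficient(pred, truth):
    """Dice Similarity Coefficient between two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of elements")
    # Count voxels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Sum of foreground sizes of the two masks.
    size_sum = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if size_sum == 0:
        # Both masks empty: treated as perfect agreement (assumed convention).
        return 1.0
    return 2.0 * intersection / size_sum


if __name__ == "__main__":
    # Toy masks: one overlapping voxel, two foreground voxels in each mask.
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 1, 0]
    print(dice_coefficient(predicted, reference))
```

A score of 1.0 means the two masks coincide and 0.0 means they share no foreground voxel.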
