2018
DOI: 10.48550/arxiv.1811.07056
Preprint

Domain Adaptive Transfer Learning with Specialist Models

Abstract: Transfer learning is a widely used method to build high performing computer vision models. In this paper, we study the efficacy of transfer learning by examining how the choice of data impacts performance. We find that more pre-training data does not always help, and transfer performance depends on a judicious choice of pre-training data. These findings are important given the continued increase in dataset sizes. We further propose domain adaptive transfer learning, a simple and effective pre-training method u…

Cited by 42 publications (72 citation statements) | References 31 publications
“…Large scale transfer learning by pre-training on JFT [Dosovitskiy et al., 2020, Ryoo et al., 2021, Mustafa et al., 2021, Tay et al., 2021a, Puigcerver et al., 2020, Ngiam et al., 2018] or ImageNet21K [Dosovitskiy et al., 2020, Mustafa et al., 2021, Puigcerver et al., 2020] has been done extensively. Mensink et al. [2021] consider a two-step transfer chain, where the model is pre-trained on ImageNet, fine-tuned on the source task and then transferred to the target task.…”
Section: Appendix
Confidence: 99%
“…These methods clearly improve over the counterpart trained with only clean labeled data and achieve stronger transfer performance. On the other hand, Domain Adaptive Transfer (DAT) (Ngiam et al., 2018) studies the influence of data quality and finds that using more data does not necessarily lead to better transferability, especially when the dataset is extremely large. Thus, an importance weighting strategy is proposed to carefully choose the pre-training data that are most relevant to the target task.…”
Section: Supervised Pre-training
Confidence: 99%
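The importance-weighting idea quoted above can be illustrated with a brief sketch. The snippet below is a hypothetical illustration rather than the authors' implementation: it assumes a source-trained classifier has already produced an estimated label distribution for the target domain, and the function and variable names (e.g. domain_adaptive_weights) are invented for illustration.

import numpy as np

def domain_adaptive_weights(source_labels, target_label_probs, num_classes, eps=1e-8):
    # Weight each source example by P_target(y) / P_source(y).
    # Empirical label prior of the pre-training (source) data.
    source_label_probs = np.bincount(source_labels, minlength=num_classes).astype(float)
    source_label_probs /= source_label_probs.sum()
    # Per-class importance ratio, indexed back to per-example weights.
    ratio = target_label_probs / (source_label_probs + eps)
    return ratio[source_labels]

# Usage sketch: the weights can drive weighted sampling (or a weighted loss)
# during pre-training so that target-relevant classes dominate.
rng = np.random.default_rng(0)
num_classes = 5
source_labels = rng.integers(0, num_classes, size=10_000)
# Assumed estimate of the target label distribution, e.g. averaged predictions
# of a source-trained classifier over unlabeled target images.
target_label_probs = np.array([0.60, 0.30, 0.05, 0.03, 0.02])

weights = domain_adaptive_weights(source_labels, target_label_probs, num_classes)
sample_probs = weights / weights.sum()
pretrain_subset = rng.choice(len(source_labels), size=2_000, replace=False, p=sample_probs)

Selecting (or up-weighting) pre-training examples this way concentrates the pre-training distribution on classes that matter for the target task, which is the behavior the cited statement attributes to DAT.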
“…1. More closely related datasets can be better than more source data for pre-training [28, 22, 6, 8, 33].…”
Section: Related Work
Confidence: 99%
“…Convolutional neural networks (CNNs) have achieved many successes in image classification in recent years [17, 9, 19, 24, 25]. It has been consistently demonstrated that CNNs work best when abundant labelled data are available for the task and very deep models can be trained [28, 22, 14]. However, there are many real-world scenarios where the large amounts of training data required for the best performance cannot be obtained or are prohibitively expensive.…”
Section: Introduction
Confidence: 99%