2022 International Joint Conference on Neural Networks (IJCNN)
DOI: 10.1109/ijcnn55064.2022.9892393
Effect of pre-training scale on intra- and inter-domain, full and few-shot transfer learning for natural and X-Ray chest images

Cited by 14 publications (7 citation statements)
References 21 publications
“…Even though the horizontal stride of the modified network is different, which can change the horizontal scale and appearance of features in deeper layers of the network, using existing ImageNet-pretrained weights is still a sensible initialization procedure. ImageNet pretraining has been shown to be consistently beneficial in a wide array of image classification tasks, some of which have different image dimensions, scales of objects appearing in the images, and even cover an entirely different domain of images than the ImageNet-1k dataset [56,57]. We empirically validate the contribution of pretraining for weight initialization in our experiments.…”
mentioning
confidence: 60%
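As a concrete illustration of the initialization procedure this statement refers to, the following sketch (our own, not from the cited paper) loads ImageNet-1k-pretrained ResNet-50 weights from torchvision and replaces the classification head for a new downstream task; the 14-class head size is a placeholder assumption.

import torch
import torch.nn as nn
from torchvision import models

# Initialize from ImageNet-1k pretrained weights rather than from scratch.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)

# Swap the ImageNet head for one sized to the downstream task
# (14 classes is a placeholder; use the target dataset's class count).
model.fc = nn.Linear(model.fc.in_features, 14)

# All remaining layers keep their pretrained weights and are fine-tuned
# together with the new head on the downstream data.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)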
“…There are factors known to impact transfer that we could not test for PLMs due to a lack of public models or to computational expense. First, pretraining dataset is important, both in terms of distance between the pretraining and downstream task data domains (Cherti & Jitsev, 2022) and data size (Abnar et al, 2022). PLMs pretrain on large databases of natural sequences.…”
Section: Discussion
mentioning
confidence: 99%
“…Finally, we only test linear probes on mean pooled representations to limit computational cost, but previous work shows that for many tasks finetuning the PLM end-to-end outperforms a linear probe or training a small neural network on top of the frozen pretrained weights (Dallago et al, 2021; Yang et al, 2022), and that mean-pooling is rarely optimal (Detlefsen et al, 2022; Goldman et al, 2022). In computer vision, models trained on different datasets (Cherti & Jitsev, 2022) and pretraining tasks (Grigg et al, 2021) exhibit different finetuning dynamics, and there is some evidence for this in proteins as well (Detlefsen et al, 2022).…”
Section: Discussion
mentioning
confidence: 99%
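To make the probing setup in this excerpt concrete, here is a minimal sketch of a linear probe trained on mean-pooled, frozen representations; the synthetic arrays stand in for per-token PLM embeddings, and the shapes and scikit-learn probe are our assumptions rather than the cited work's code.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Stand-in for frozen per-token embeddings from a pretrained language model:
# embeddings[i] has shape (length_i, hidden_dim) for sequence i.
rng = np.random.default_rng(0)
embeddings = [rng.normal(size=(rng.integers(50, 200), 768)) for _ in range(100)]
labels = rng.integers(0, 2, size=100)

# Mean-pool each sequence over its length to a fixed-size vector.
pooled = np.stack([e.mean(axis=0) for e in embeddings])

# Train a linear probe on the frozen, mean-pooled representations.
probe = LogisticRegression(max_iter=1000).fit(pooled, labels)
print("train accuracy:", probe.score(pooled, labels))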
“…We measured the perplexity of each prompt using a surrogate language model, the underlying hypothesis here being that less likely prompts occur less frequently (if at all) in the training data of the surrogate language model, and will thus incur higher perplexity. Comparing the perplexity scores with the actual frequencies of prompt tokens in LAION-5B (Schuhmann et al, 2022) is an interesting avenue for future work. Figure 3 displays scatter plots of the intrinsic dimension and perplexity of these prompts.…”
Section: Methods
mentioning
confidence: 99%
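A minimal sketch of the perplexity measurement described above, assuming GPT-2 from Hugging Face transformers as the surrogate language model (the excerpt does not specify which model or code was actually used):

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# A small causal LM serves as the surrogate language model.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def prompt_perplexity(prompt: str) -> float:
    # Perplexity = exp of the mean token-level negative log-likelihood.
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        loss = model(**inputs, labels=inputs["input_ids"]).loss
    return torch.exp(loss).item()

print(prompt_perplexity("a photo of an astronaut riding a horse"))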