2021
DOI: 10.3847/2041-8213/abf2c7
Self-supervised Representation Learning for Astronomical Images

Abstract: Sky surveys are the largest data generators in astronomy, making automated tools for extracting meaningful scientific information an absolute necessity. We show that, without the need for labels, self-supervised learning recovers representations of sky survey images that are semantically useful for a variety of scientific tasks. These representations can be directly used as features, or fine-tuned, to outperform supervised methods trained only on labeled data. We apply a contrastive learning framework on multi…

Cited by 46 publications (42 citation statements) · References 52 publications

Citation statements (ordered by relevance):
“…Self-supervised learning. We closely follow [9] in designing the architecture and training procedure for our self-supervised model, which is based on MoCov2 [4]. In this setting, the backbone of the model is a CNN encoder that takes an image x as input and produces a lower dimensional representation z.…”
Section: Methods
confidence: 99%
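To make the setup in this excerpt concrete, the sketch below shows one plausible PyTorch encoder of that form: a torchvision ResNet-50 backbone with its classification head replaced by a MoCo v2-style projection MLP, mapping an image x to a normalized, lower-dimensional representation z. The 128-dimensional output and the exact head layout are illustrative assumptions, not the cited implementation:

import torch
import torch.nn as nn
import torchvision.models as models

class Encoder(nn.Module):
    """CNN backbone mapping an image x to a low-dimensional representation z."""

    def __init__(self, dim: int = 128):
        super().__init__()
        backbone = models.resnet50(weights=None)     # ResNet-50 feature extractor
        feat_dim = backbone.fc.in_features           # 2048 for ResNet-50
        backbone.fc = nn.Identity()                  # drop the classification head
        self.backbone = backbone
        # MoCo v2-style two-layer projection head onto the contrastive space.
        self.projector = nn.Sequential(
            nn.Linear(feat_dim, feat_dim),
            nn.ReLU(inplace=True),
            nn.Linear(feat_dim, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.backbone(x)                         # (batch, 2048) pooled features
        z = self.projector(h)                        # (batch, dim) representation
        return nn.functional.normalize(z, dim=1)     # unit-norm z for cosine similarity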
“…The encoder learns to make meaningful representations by associating augmented views of the same image as similar, and views of different images as dissimilar, via a contrastive loss function. We use the same ResNet50 network and training hyperparameters as [9], but increase the queue length to K = 262,144 to accommodate our larger training set. We choose the following set of augmentations, applying each of them in succession to images during pre-training with the order listed below [see 16, for motivation]:…”
Section: Methods
confidence: 99%
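Read literally, the contrastive loss in that excerpt is the InfoNCE objective used by MoCo: each query embedding should match the key embedding of the other augmented view of the same image, while being dissimilar to the K embeddings held in a negative queue. The function below is a minimal sketch under that reading (the temperature of 0.2 is an assumed hyperparameter, not taken from the cited paper):

import torch
import torch.nn.functional as F

def info_nce_loss(q: torch.Tensor, k: torch.Tensor,
                  queue: torch.Tensor, temperature: float = 0.2) -> torch.Tensor:
    """Contrastive (InfoNCE) loss: pull the two views of an image together,
    push them away from the K negatives stored in the queue.

    q, k:   (batch, dim) L2-normalized embeddings of two augmented views
    queue:  (dim, K)     L2-normalized embeddings of earlier images (negatives)
    """
    # Positive logits: similarity between the two views of each image.
    l_pos = torch.einsum("nc,nc->n", q, k).unsqueeze(-1)   # (batch, 1)
    # Negative logits: similarity of each query against every queued key.
    l_neg = torch.einsum("nc,ck->nk", q, queue)            # (batch, K)
    logits = torch.cat([l_pos, l_neg], dim=1) / temperature
    # The positive sits at index 0, so the target "class" is 0 for every row.
    labels = torch.zeros(logits.size(0), dtype=torch.long, device=q.device)
    return F.cross_entropy(logits, labels)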
“…In contrast, as a new paradigm between unsupervised and supervised learning, SSL can generate labels based on the property of unlabeled data itself to train the neural network in a supervised manner similar to natural learning experiences. With excellent performance on representation learning and dealing with the issue of unlabelled data, SSL [20][21][22] has been successfully implemented in a wide range of fields, including image recognition 23 , audio representation 24 , computer vision 25 , document reconstruction 26 , atmosphere 27 , astronomy 28 , medical 29 , person re-identification 30 , remote sensing 31 , robotics 32 , omnidirectional imaging 33 , manufacturing 34 , nano-photonics 35 , and civil engineering 36 , etc. However, this method has not been formally attempted in material science.…”
Section: High-efficient Low-cost Characterization of Materials Properties Using Domain-knowledge-guided Self-supervised Learning
confidence: 99%
“…The representations for different images are maintained in a queue during training, which extends the number of contrasting examples available to the model at each training step beyond just those available in a given minibatch. More details on this approach can be found in Hayat et al. (2021) and Chen, X. et al. (2020). We use the same ResNet50 network and training hyperparameters as Hayat et al. (2021), but increase the queue length to K = 262,144 to accommodate our larger training set.…”
Section: Self-supervised Pre-training On Unlabeled Images
confidence: 99%
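A minimal sketch of how such a key queue can be maintained follows, assuming MoCo-style bookkeeping; the tensor layout, circular pointer, and the K = 262,144 capacity shown here are illustrative rather than the cited implementation:

import torch

class KeyQueue:
    """Fixed-size FIFO of key embeddings used as negatives in the contrastive loss."""

    def __init__(self, dim: int = 128, capacity: int = 262_144):
        # Store keys column-wise so the loss can use (dim, K) matrix products directly.
        self.queue = torch.nn.functional.normalize(torch.randn(dim, capacity), dim=0)
        self.ptr = 0
        self.capacity = capacity

    @torch.no_grad()
    def enqueue(self, keys: torch.Tensor) -> None:
        """Replace the oldest entries with the current minibatch of keys (batch, dim)."""
        batch = keys.shape[0]
        assert self.capacity % batch == 0, "capacity should be a multiple of batch size"
        self.queue[:, self.ptr:self.ptr + batch] = keys.T
        self.ptr = (self.ptr + batch) % self.capacity   # advance the circular pointer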
“…The utility of self-supervised models applied to astronomical imagery has been recently showcased by Hayat et al. (2021), who find that self-supervised pretraining on a large unlabeled set of SDSS galaxy images improves performance on tasks like redshift estimation and morphology classification, and that these performance gains are most significant when the number of labels for supervised training is limited. Hayat et al. (2021) also show that the representation space learned in self-supervised pretraining is semantically meaningful, and readily provides a similarity metric that can identify additional examples of a query object, such as an observational error or anomalous galaxy. As automated strong lens detection is a problem inherently limited by the number of labeled examples, self-supervised models are thus an exciting prospect for quickly identifying new candidates given a large set of galaxy imagery.…”
Section: Introduction
confidence: 99%
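As an illustration of the similarity lookup described above (an assumed cosine-similarity search over precomputed representations, not the authors' actual pipeline), nearest neighbours of a query object can be read straight off the representation space:

import torch
import torch.nn.functional as F

def most_similar(query_z: torch.Tensor, bank_z: torch.Tensor, k: int = 10) -> torch.Tensor:
    """Return indices of the k images whose representations are closest to the query.

    query_z: (dim,)   representation of the query object (e.g. a candidate lens)
    bank_z:  (N, dim) precomputed representations of the survey images
    """
    sims = F.cosine_similarity(query_z.unsqueeze(0), bank_z, dim=1)  # (N,) similarities
    return sims.topk(k).indices

In this picture the bank would hold the frozen encoder's outputs for the full set of survey images, so a single labeled example is enough to rank every other image by similarity to it.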