2022
DOI: 10.1109/lgrs.2021.3069799

Contrastive Self-Supervised Learning With Smoothed Representation for Remote Sensing

Cited by 42 publications (27 citation statements: 0 supporting, 27 mentioning, 0 contrasting) | References 16 publications

“…Most existing methods apply InfoNCE loss [18, 22, 24-28, 30, 33, 62] or triplet loss [17,60] on the constructed positive and negative pairs. Positive samples can be obtained by different artificial augmentations (e.g., color and geometric transformations) of the same image [25,28], spatial augmentations (i.e., geospatially overlapping images) [24,27,60], temporal augmentations (i.e., multi-temporal co-registered images) [16,17,22,26,62], and modality augmentations (e.g., optical image, SAR, and semantic mask) [29,63]. Negative pairs can be different samples in a minibatch or spatially distinct images [60,62].…”
Section: Semantic Dissimilarity (mentioning)
confidence: 99%
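
The InfoNCE objective that the excerpt above refers to fits in a few lines. Below is a minimal PyTorch sketch, assuming matched batches of query and key embeddings from two augmented views of the same images; the function name, temperature value, and in-batch negative scheme are illustrative choices, not details taken from the cited papers.

```python
import torch
import torch.nn.functional as F

def info_nce(z_q, z_k, temperature=0.1):
    """InfoNCE: row i of z_q is positive with row i of z_k;
    every other key in the batch acts as a negative."""
    z_q = F.normalize(z_q, dim=1)          # (N, D) query embeddings
    z_k = F.normalize(z_k, dim=1)          # (N, D) key embeddings
    logits = z_q @ z_k.t() / temperature   # (N, N) cosine similarities
    labels = torch.arange(z_q.size(0), device=z_q.device)  # positives on the diagonal
    return F.cross_entropy(logits, labels)

# Positive pair from two artificial augmentations of the same batch:
# z_q, z_k = encoder(aug(x)), encoder(aug(x)); loss = info_nce(z_q, z_k)
```

The same loss covers the spatial, temporal, and modality augmentation variants listed above; only the way the two views are constructed changes.
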
“…Contrastive SSL [20,21] can learn useful representations from massive unlabeled data by pulling together the representations of semantically similar samples (i.e., positive pairs) and pushing apart those of dissimilar samples (i.e., negative pairs). Very recently, contrastive methods have been introduced in the RS domain [16-18, 22-33] and have shown promising performance for the downstream supervised CD task [16-18].…”
Section: Introduction (mentioning)
confidence: 99%
“…This task is illustrated in Figure 3 and can be formalized as x_k+ = aug(sample_neighbor(x_q)), where aug is the same as above and sample_neighbor generates a geographically close patch. Inspired by [26], this strategy aims to help the network better cluster together similar regions (land, water bodies, etc.). The maximum distance can be varied to control the average overlap of the sampled patches.…”
Section: Pretext Task Settings (mentioning)
confidence: 99%
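
A minimal sketch of how such a spatial positive could be sampled, assuming image tensors with trailing (H, W) axes; sample_neighbor and the pixel-distance parameterization mirror the excerpt's notation, but this particular implementation is hypothetical.

```python
import random

def sample_neighbor(image, x, y, patch_size, max_dist):
    """Crop a patch whose top-left corner is at most max_dist pixels
    from (x, y), so the neighbor overlaps the query patch on average."""
    h, w = image.shape[-2:]
    nx = min(max(x + random.randint(-max_dist, max_dist), 0), w - patch_size)
    ny = min(max(y + random.randint(-max_dist, max_dist), 0), h - patch_size)
    return image[..., ny:ny + patch_size, nx:nx + patch_size]

# x_q   = aug(image[..., y:y + patch_size, x:x + patch_size])
# x_k+  = aug(sample_neighbor(image, x, y, patch_size, max_dist))
```

Shrinking max_dist increases the expected overlap between the two crops, which is how the excerpt's "maximum distance" knob controls how semantically close the positive pair is.
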
“…Unlike RS, in the computer vision (CV) community, unsupervised and in particular self-supervised cross-modal representation learning methods (which rely only on the alignments between modalities) are widely studied [8-13]. As an example, in [9] a deep joint-semantics reconstructing hashing (DJSRH) method is introduced to learn binary codes that preserve the neighborhood structure of the original data.…”
Section: Introduction (mentioning)
confidence: 99%
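
As a rough illustration of the neighborhood-preserving idea behind such hashing methods, the sketch below trains relaxed binary codes whose inner products reconstruct a feature-space affinity matrix. This is a simplified, hypothetical single-modality reduction, not DJSRH itself, which builds a joint-semantics affinity from both modalities.

```python
import torch
import torch.nn.functional as F

def neighborhood_reconstruction_loss(features, codes):
    """Make code inner products mimic the cosine-similarity structure
    of the input features, so neighbors get similar binary codes."""
    f = F.normalize(features, dim=1)
    S = f @ f.t()                        # (N, N) feature affinity in [-1, 1]
    B = torch.tanh(codes)                # relaxed binary codes in (-1, 1)
    S_hat = B @ B.t() / codes.size(1)    # code affinity, scaled to [-1, 1]
    return F.mse_loss(S_hat, S)

# At retrieval time the final hash would be torch.sign(codes), giving +/-1 bits.
```
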