“…Most existing methods apply InfoNCE loss [18, 22, 24-28, 30, 33, 62] or triplet loss [17,60] on the constructed positive and negative pairs. Positive samples can be obtained by different artificial augmentations (e.g., color and geometric transformations) of the same image [25,28], spatial augmentations (i.e., geospatially overlapped images) [24,27,60], temporal augmentations (i.e., multi-temporal co-registered images) [16,17,22,26,62], and modality augmentations (e.g., optical image, SAR, and semantic mask) [29,63]. Negative pairs can be different samples in a minibatch or spatially distinct images [60,62].…”
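
To make the two objectives named in the passage concrete, the following is a minimal pure-Python sketch, not the implementation of any cited method. It assumes embeddings are plain lists of floats and non-zero, that similarity is cosine similarity, and that the other samples in the minibatch act as negatives. The function names `info_nce` and `triplet_loss`, the temperature of 0.1, and the margin of 1.0 are illustrative assumptions rather than values taken from the source.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (assumes v is not the zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a minibatch.

    Row i of `positives` is the positive view for row i of `anchors`
    (e.g., an augmented, overlapping, or co-registered image embedding).
    Every other row of `positives` is treated as a negative for anchor i.
    The temperature default is an illustrative assumption.
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of anchor i to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp for the softmax denominator.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability of picking the true positive.
        total += log_denom - logits[i]
    return total / len(a)


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss on Euclidean distances.

    The margin default is an illustrative assumption.
    """
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)
```

In this sketch the InfoNCE loss is small when each anchor is most similar to its own positive and large when it is more similar to another sample in the batch, while the triplet loss drops to zero once the negative is farther from the anchor than the positive by at least the margin.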