2022
DOI: 10.48550/arxiv.2202.03278
Preprint

Crafting Better Contrastive Views for Siamese Representation Learning

Abstract: Recent self-supervised contrastive learning methods greatly benefit from the Siamese structure that aims at minimizing distances between positive pairs. For high performance Siamese representation learning, one of the keys is to design good contrastive pairs. Most previous works simply apply random sampling to make different crops of the same image, which overlooks the semantic information that may degrade the quality of views. In this work, we propose ContrastiveCrop, which could effectively generate better c…
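The baseline setup the abstract refers to, purely random crops of the same image forming a positive pair, can be illustrated with a minimal sketch. This is not the paper's code; the use of torchvision and the transform parameters (crop size 224, scale range) are illustrative assumptions.

```python
# Minimal sketch of the standard two-crop view generation that the abstract
# critiques: both views come from purely random crops of the same image, with
# no use of semantic information. Parameter values are illustrative only.
import torchvision.transforms as T

random_two_crop = T.Compose([
    T.RandomResizedCrop(224, scale=(0.2, 1.0)),  # random location and scale
    T.RandomHorizontalFlip(),
    T.ToTensor(),
])

def make_positive_pair(image):
    # Two independent random crops of the same image form a positive pair.
    return random_two_crop(image), random_two_crop(image)
```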

Cited by 10 publications (8 citation statements)
References 37 publications
“…It is also not logical to recover the token of the main object that is masked by the token of the unmasked background and other objects at this point. Therefore, we introduce ContrastiveCrop [24] to perform data augmentation on the input images in MIM. Since contrastive learning draws the positive case samples closer and pushes the negative case samples farther in the high-dimensional space.…”
Section: Data Augmentation (mentioning)
confidence: 99%
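The statement above summarizes the contrastive objective: positive pairs are drawn together while negatives are pushed apart in embedding space. A minimal sketch of a generic InfoNCE-style loss illustrates this behaviour; the cited works may use different losses or Siamese variants without explicit negatives, and the temperature value here is an arbitrary choice.

```python
# Generic InfoNCE-style contrastive loss: embeddings of the two views of the
# same image (positives) are pulled together, while the other samples in the
# batch act as negatives and are pushed apart. Sketch only, not the exact loss
# used by the cited works.
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    z1 = F.normalize(z1, dim=1)           # (N, D) embeddings of view 1
    z2 = F.normalize(z2, dim=1)           # (N, D) embeddings of view 2
    logits = z1 @ z2.t() / temperature    # cosine similarities of all pairs
    labels = torch.arange(z1.size(0), device=z1.device)  # positives on diagonal
    return F.cross_entropy(logits, labels)
```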
“…When the proportion of primary objects is relatively small or the unmasked tokens are mainly primary objects, the reconstruction of background or other objects is not meaningful for the encoder to learn the primary object features. Also, to perform the contrastive learning task better, we introduce ContrastiveCrop [24] to process the input images. It helps to crop the primary object area by positioning the primary object with the semantic information extracted by the encoder, thus minimizing the interference of background and secondary objects.…”
Section: Introduction (mentioning)
confidence: 99%
“…Nonetheless, the widely-used attention-guided Center Zoom mechanism only zooms in on the whole suspicious area, which increases the difficulties of feature decomposition and the probability of ignoring subtle regions [18], [43]. We delicately designed our CMZ strategy for reducing the intersection of the local-branch inputs inspired by Contrastive Crop [44], which is proposed first in self-supervised learning (SSL) for data augmentation.…”
Section: B. Weakly-supervised Lesion Localization Module (mentioning)
confidence: 99%
“…In [14] the authors discuss challenges for multi-factor datasets in the context of natural images and propose using a subset of labels during pretraining to selectively tune for desirable factors. Another recent work [28] proposes initializing views randomly but using approximate heatmaps from higher level convolutional layers as guidance for curating better views adding a little computation overhead.…”
Section: Related Work (mentioning)
confidence: 99%
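The heatmap-guided view curation described above can be sketched as sampling a crop center in proportion to an activation map pooled from a higher-level feature layer. The sketch below is an illustration under assumed details; the heatmap source, coordinate mapping, and box sizing are not the exact ContrastiveCrop procedure.

```python
# Illustrative sketch of heatmap-guided cropping: the crop center is sampled
# in proportion to an activation heatmap, so views are biased toward the
# semantically salient region instead of being placed uniformly at random.
# The heatmap source and box sizing are assumptions, not the paper's method.
import numpy as np

def sample_guided_crop(heatmap, img_h, img_w, crop_h, crop_w):
    # Normalize the heatmap into a probability distribution over locations.
    probs = heatmap / heatmap.sum()
    idx = np.random.choice(probs.size, p=probs.ravel())
    hy, hx = np.unravel_index(idx, probs.shape)
    # Map the (lower-resolution) heatmap coordinate to image coordinates.
    cy = int((hy + 0.5) / probs.shape[0] * img_h)
    cx = int((hx + 0.5) / probs.shape[1] * img_w)
    # Clamp so the crop box stays inside the image.
    top = int(np.clip(cy - crop_h // 2, 0, img_h - crop_h))
    left = int(np.clip(cx - crop_w // 2, 0, img_w - crop_w))
    return top, left, crop_h, crop_w
```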