2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022
DOI: 10.1109/cvpr52688.2022.01131

ContrastMask: Contrastive Learning to Segment Every Thing

Cited by 31 publications
(10 citation statements)
References 37 publications
“…Recently, the open-set problem has been explored in various computer vision tasks (Bendale and Boult 2015; Dhamija et al 2020; Joseph et al 2021; Gupta et al 2022; Zhao et al 2022; Vaze et al 2021; Qi et al 2021; Saito et al 2021; Wang et al 2022b; Hwang et al 2021; Wang et al 2022a, 2021). Dhamija et al (Dhamija et al 2020) first formalize the open-set object detection problem and propose the open-set object detection protocol to better estimate the performance under real-world conditions.…”
Section: Open-set Detection and Segmentation (mentioning)
confidence: 99%
“…The MAE [6] is a type of neural network architecture that learns to reconstruct its input data while selectively ignoring or masking certain parts of the input, typically used for feature learning and dimensionality reduction. Masked autoencoders can be successful in various computer vision tasks [39,42,45,46], such as image segmentation, tracking, and generation, because they can capture and represent meaningful features in images by learning to reconstruct them while ignoring noisy or less relevant information. The proposed BTMAE can effectively enhance RIS performance by leveraging the feature representation modeling capability of MAE to learn complex contextual relationships between modalities in multimodal spaces.…”
Section: Related Work (mentioning)
confidence: 99%
“…One promising solution to this problem is to leverage abundant unlabeled data through semi-supervised learning (SSL), which has been actively explored in medical segmentation tasks (Luo et al 2021a; You et al 2023; Peiris et al 2023; Zhu et al 2021). Existing SSL methods typically leverage knowledge learned from a few labeled images through pseudo labeling, and then enhance representation learning through consistency regularization on unlabeled data (Luo et al 2021b; Wang et al 2022; You et al 2022; Zhou et al 2019). However, as pointed out in (Wang, Li, and Gool 2019), in practical scenarios, the distribution of labeled data often more or less deviates from the true distribution of the real-world dataset, which is caused by the small sample size and the randomness in sampling.…”
Section: Introduction (mentioning)
confidence: 99%