Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence 2021
DOI: 10.24963/ijcai.2021/136

Learning Visual Words for Weakly-Supervised Semantic Segmentation

Abstract: Current weakly-supervised semantic segmentation (WSSS) methods with image-level labels mainly adopt class activation maps (CAM) to generate the initial pseudo labels. However, CAM usually identifies only the most discriminative object extents, because the network does not need to discover the integral object to recognize image-level labels. In this work, to tackle this problem, we propose to simultaneously learn the image-level labels and local visual word labels. Specifically, in …
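To make the CAM step mentioned in the abstract concrete, here is a minimal sketch of the standard class activation map formulation: a class-weighted sum of the last convolutional feature maps, clipped at zero and scaled by its peak. It is not the paper's visual-word method. The function names `class_activation_map` and `pseudo_label`, the toy inputs, and the 0.5 threshold are assumptions made for illustration.

```python
def class_activation_map(features, class_weights):
    """Standard CAM for one class (illustrative sketch, not the paper's code).

    features: list of C channel maps, each an H x W nested list.
    class_weights: list of C classifier weights for the chosen class.
    Returns an H x W map scaled to [0, 1].
    """
    height, width = len(features[0]), len(features[0][0])
    cam = [[0.0] * width for _ in range(height)]
    # Weighted sum of the channel maps.
    for channel, weight in zip(features, class_weights):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * channel[y][x]
    # Keep positive evidence only, then scale by the peak response.
    cam = [[max(value, 0.0) for value in row] for row in cam]
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[value / peak for value in row] for row in cam]
    return cam


def pseudo_label(cam, threshold=0.5):
    """Binarize a CAM into a foreground mask (the threshold is an assumption)."""
    return [[1 if value >= threshold else 0 for value in row] for row in cam]


# Toy example: the object covers the top row, but the classifier relies
# mostly on channel 0, which fires only on the top-left pixel.
features = [
    [[1.0, 0.0], [0.0, 0.0]],  # channel 0: highly discriminative part
    [[0.0, 1.0], [0.0, 0.0]],  # channel 1: rest of the object
]
weights = [1.0, 0.25]

cam = class_activation_map(features, weights)
mask = pseudo_label(cam)
```

In this toy case the thresholded mask keeps only the top-left pixel, which mirrors the limitation the abstract describes: the map covers the most discriminative part rather than the whole object.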

Cited by 16 publications (22 citation statements)
References 16 publications (13 reference statements)

“…CAMs / +RW:
PSA, CVPR'2018 [2], WR38: 48.0 / 61.0
SC-CAM, CVPR'2020 [4], WR38: 50.9 / 63.4
SEAM, CVPR'2020 [31], WR38: 55.4 / 63.6
PuzzleCAM, ICIP'2021 [13], R50: 51.5 / 64.7
VWE, IJCAI'2021 [26], R50: 52.9 / -
AdvCAM, CVPR'2021 [18], R50: 55.6 / 68.0
CLIMS (Ours), R50: 56.6 / 70.5…”
Section: Methods
mentioning
confidence: 99%
“…IAL, IJCV'20 [30], V2, -: 64.3 / 65.4
SEAM, CVPR'20 [31], V3, WR38: 64.5 / 65.7
BES, ECCV'20 [7], V2, R50: 65.7 / 66.6
SC-CAM, CVPR'20 [4], V2 ‡, WR38: 66.1 / 65.9
CONTA, NeurIPS'20 [37], V3, WR38: 66.1 / 66.7
A²GNN, TPAMI'21 [36], V2, WR38: 66.8 / 67.4
VWE, IJCAI'2021 [26], V2 ‡, R50: 67.2 / 67.3
AdvCAM, CVPR'21 [18], V2, R50: 68.1 / 68.0
Kweon et al ICCV'21 [17]
Segmentation Network. Given pseudo ground-truth masks, we follow VWE [26], SC-CAM [4] and Adv-CAM [18] to adopt DeepLabV2 with ResNet-101 [10] as the segmentation network. For experiments on PASCAL VOC2012 dataset, we follow the default setting of deeplabpytorch toolkit † to train DeepLabV2 with weights pretrained using MS COCO dataset.…”
mentioning
confidence: 99%
“…This paper is an improved version of our preliminary work (Ru et al, 2021). Compared with the conference version, this work further improves the learning-based strategy and proposes the memory-bank strategy which could learn visual words better.…”
Section: Image Ours Cams
mentioning
confidence: 99%
“…In Fig. 6, we visualize the generated CAMs and compare them with the results of recent methods, including IRNet (Ahn et al, 2019), SEAM (Wang et al, 2020b), and VWE (our previous work with HP and simple visual words encoder) (Ru et al, 2021). The results of the learning-based strategy (Ours-L) and the memory-bank strategy (Ours-M) are both presented.…”
Section: Image
mentioning
confidence: 99%