Proceedings of the 28th ACM International Conference on Multimedia 2020
DOI: 10.1145/3394171.3413593

Context-aware Feature Generation for Zero-shot Semantic Segmentation

Abstract: Existing semantic segmentation models rely heavily on dense pixel-wise annotations. To reduce the annotation burden, we focus on a challenging task named zero-shot semantic segmentation, which aims to segment unseen objects with zero annotations. This task can be accomplished by transferring knowledge across categories via semantic word embeddings. In this paper, we propose a novel context-aware feature generation method for zero-shot segmentation named CaGNet. In particular, with the observation that a pixel…
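
To make the feature-generation idea concrete, here is a minimal PyTorch sketch of a generator that maps a class word embedding plus a latent code to a synthetic pixel-wise feature; the names and dimensions (FeatureGenerator, embed_dim=300, feat_dim=512) are illustrative assumptions, not CaGNet's actual architecture.

```python
import torch
import torch.nn as nn

# Minimal sketch of word-embedding-conditioned feature generation
# (hypothetical names; not the authors' actual CaGNet code).
class FeatureGenerator(nn.Module):
    """Maps a class word embedding plus a latent code to a fake pixel feature."""
    def __init__(self, embed_dim=300, latent_dim=16, feat_dim=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim + latent_dim, 256),
            nn.LeakyReLU(0.2),
            nn.Linear(256, feat_dim),
        )

    def forward(self, word_embed, z):
        # word_embed: (N, embed_dim) class embeddings; z: (N, latent_dim) latent code
        return self.net(torch.cat([word_embed, z], dim=1))

gen = FeatureGenerator()
word_embed = torch.randn(8, 300)   # e.g. word2vec vectors of class labels
z = torch.randn(8, 16)             # latent code (contextual in CaGNet)
fake_feats = gen(word_embed, z)    # (8, 512) synthesized pixel features
# A segmentation classifier can then be (re)trained on real seen-class
# features plus generated unseen-class features.
```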


Cited by 80 publications (86 citation statements: 0 supporting, 86 mentioning, 0 contrasting)
References 42 publications
“…Text embeddings of class labels play a central role in these works. Bucher et al. (2019) and Gu et al. (2020) propose to leverage word embeddings together with a generative model to generate visual features of unseen categories, while Xian et al. (2019) propose to project visual features into a simple word embedding space and to correlate the resulting embeddings to assign a label to a pixel. Other works propose uncertainty-aware learning to better handle noisy labels of seen classes, or introduce a structured learning approach to better exploit the relations between seen and unseen categories.…”
Section: Related Work
confidence: 99%
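
For contrast, the projection-based alternative attributed to Xian et al. (2019) above can be sketched as follows; the linear projection and cosine scoring here are assumptions chosen for illustration, not that paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Rough sketch: project pixel features into a word-embedding space and
# assign each pixel the label of the most similar class embedding.
feat_dim, embed_dim, num_classes = 512, 300, 20
project = nn.Linear(feat_dim, embed_dim)

pixel_feats = torch.randn(4, feat_dim, 32, 32)   # backbone features (N, C, H, W)
class_embeds = F.normalize(torch.randn(num_classes, embed_dim), dim=1)

proj = project(pixel_feats.permute(0, 2, 3, 1))  # (N, H, W, embed_dim)
proj = F.normalize(proj, dim=-1)
scores = proj @ class_embeds.t()                 # cosine similarity per pixel
labels = scores.argmax(dim=-1)                   # (N, H, W) predicted classes
```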
“…However, these approaches still require labeled data that includes the novel classes in order to facilitate transfer. Zero-shot methods, on the other hand, commonly leverage word embeddings to discover or generate related features between seen and unseen classes (Bucher et al., 2019; Gu et al., 2020) without the need for additional annotations. Existing works in this space use standard word embeddings (Mikolov et al., 2013) and focus on the image encoder.…”
Section: Introduction
confidence: 99%
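
As a small illustration of the "standard word embeddings" mentioned above, class-label vectors can be fetched from a pretrained word2vec model (Mikolov et al., 2013), for example via gensim; treating each class label as a single vocabulary word is a simplifying assumption.

```python
# Sketch: fetching word2vec vectors for class labels via gensim's downloader.
import gensim.downloader as api

w2v = api.load("word2vec-google-news-300")    # pretrained word2vec (~1.6 GB)
class_names = ["cat", "dog", "train"]
embeds = [w2v[name] for name in class_names]  # each a 300-d numpy vector
```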
“…Recently, multi-scale feature fusion has achieved remarkable success in many computer vision fields such as object detection [22], salient object detection [7, 4], and instance segmentation [23]. However, previous methods mainly fused multi-scale features only in the encoder [22, 24] or only in the decoder [7, 25]. Moreover, most of them did not account for the redundant information induced when integrating multi-scale features.…”
Section: Multi-scale Feature Fusion
confidence: 99%
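
A toy sketch of the kind of multi-scale fusion discussed here: coarser feature maps are upsampled to the finest resolution and merged additively (an FPN-style merge; the channel counts and additive combination are assumptions for illustration, not any cited paper's exact design).

```python
import torch
import torch.nn.functional as F

# Upsample coarser maps to the finest resolution, then sum.
f1 = torch.randn(1, 256, 64, 64)   # fine scale
f2 = torch.randn(1, 256, 32, 32)   # mid scale
f3 = torch.randn(1, 256, 16, 16)   # coarse scale

fused = f1
for f in (f2, f3):
    fused = fused + F.interpolate(f, size=f1.shape[-2:], mode="bilinear",
                                  align_corners=False)
```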
“…in [12], without even knowing the number of classes a priori. Related to this, the intention of zero-shot segmentation is to segment non-annotated objects that have not previously been seen by a neural network, as in [23, 24], which, however, do not make use of scribbles.…”
Section: Related Work
confidence: 99%