2020
DOI: 10.1609/aaai.v34i07.6996

GTNet: Generative Transfer Network for Zero-Shot Object Detection

Abstract: We propose a Generative Transfer Network (GTNet) for zero-shot object detection (ZSD). GTNet consists of an Object Detection Module and a Knowledge Transfer Module. The Object Detection Module can learn large-scale seen domain knowledge. The Knowledge Transfer Module leverages a feature synthesizer to generate unseen class features, which are applied to train a new classification layer for the Object Detection Module. In order to synthesize features for each unseen class with both the intra-class variance and …
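The abstract outlines a two-step idea: a feature synthesizer conditioned on class semantics generates features for unseen classes, and those synthetic features are used to train a new classification layer for the detector. The following is a minimal sketch of that idea only, in PyTorch; the module names, dimensions, noise prior, and training loop are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): a conditional feature generator that
# synthesizes region features for unseen classes from semantic vectors, and a
# new classification layer trained on those synthetic features. All names,
# dimensions, and hyperparameters below are illustrative assumptions.
import torch
import torch.nn as nn

SEM_DIM, NOISE_DIM, FEAT_DIM, N_UNSEEN = 300, 100, 1024, 20

class FeatureSynthesizer(nn.Module):
    """Maps (class semantic vector, random noise) -> a visual region feature."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(SEM_DIM + NOISE_DIM, 2048), nn.LeakyReLU(0.2),
            nn.Linear(2048, FEAT_DIM), nn.ReLU(),
        )

    def forward(self, semantics, noise):
        return self.net(torch.cat([semantics, noise], dim=1))

def synthesize_unseen_features(generator, unseen_semantics, per_class=100):
    """Draw `per_class` synthetic features for every unseen class."""
    feats, labels = [], []
    with torch.no_grad():
        for cls_id, sem in enumerate(unseen_semantics):
            sem_batch = sem.unsqueeze(0).expand(per_class, -1)
            noise = torch.randn(per_class, NOISE_DIM)
            feats.append(generator(sem_batch, noise))
            labels.append(torch.full((per_class,), cls_id, dtype=torch.long))
    return torch.cat(feats), torch.cat(labels)

# Train a fresh classification layer on the synthetic unseen-class features.
generator = FeatureSynthesizer()                   # assumed pre-trained on seen classes
unseen_semantics = torch.randn(N_UNSEEN, SEM_DIM)  # e.g. word vectors of unseen classes
feats, labels = synthesize_unseen_features(generator, unseen_semantics)

unseen_classifier = nn.Linear(FEAT_DIM, N_UNSEEN)
opt = torch.optim.Adam(unseen_classifier.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()
for _ in range(10):                                # a few epochs for illustration
    opt.zero_grad()
    loss = criterion(unseen_classifier(feats), labels)
    loss.backward()
    opt.step()
```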

Cited by 34 publications (40 citation statements); references 24 publications (55 reference statements).
“…Zero-shot object detection. ZSD has received great research interest in recent years [4, 7, 16, 22, 31, 32, 41-43]. Some research focuses on embedding function-based methods [4, 7, 22, 31, 32, 42].…”
Section: Related Work
Mentioning (confidence: 99%)
“…Although some existing approaches have recognized the importance of intra-class diversity [16, 41], without jointly considering the inter-class separability these methods would either impose insufficient diversity on the synthesized visual features, leading to misclassifying the real unseen objects as image backgrounds (see Fig. …). To overcome the feature-synthesizing problems in real-world detection scenarios, we build a novel zero-shot object detection framework, as shown in Fig. 2. Specifically, we design two components for learning robust region features. To enable the model to synthesize diverse visual features, we propose an Intra-class Semantic Diverging (IntraSD) component, which diverges the semantic vector of a single class into a set of visual features.…”
Section: Introduction
Mentioning (confidence: 99%)
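As a purely illustrative aside on the two properties this excerpt contrasts, the sketch below measures intra-class diversity and inter-class separability for a batch of synthesized features; the tensors and class layout are hypothetical and not taken from any of the cited papers.

```python
# Illustrative sketch only: quantifying intra-class diversity and inter-class
# separability for synthesized features. The feature tensors are hypothetical.
import torch

def intra_class_diversity(features):
    """Mean pairwise L2 distance among features of one class (higher = more diverse)."""
    d = torch.cdist(features, features)
    n = features.size(0)
    return d.sum() / (n * (n - 1))

def inter_class_separability(class_features):
    """Minimum distance between class centroids (higher = better separated)."""
    centroids = torch.stack([f.mean(dim=0) for f in class_features])
    d = torch.cdist(centroids, centroids)
    d.fill_diagonal_(float("inf"))
    return d.min()

# Hypothetical synthesized features: 3 unseen classes, 50 features each, 1024-D.
class_features = [torch.randn(50, 1024) + 5.0 * i for i in range(3)]
print([intra_class_diversity(f).item() for f in class_features])
print(inter_class_separability(class_features).item())
```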
“…In order to prove that our transformer encoder and decoder are more effective than previous methods at recalling unseen objects, we train an RPN (region proposal network) for comparison. We choose the RPN here because previous methods [7, 10, 12] have used it as the unseen-object extractor. For a fair comparison, we train the RPN with the same dataset and backbone network as our ZSDTR and add an FPN [18] to it.…”
Section: Effective Study on Transformer Body
Mentioning (confidence: 99%)
“…To remedy this problem, Bansal et al. [6] and Rahman et al. [7] first proposed the ZSD task, which aims to recognize and localize unseen objects simultaneously. Since then, several methods have been proposed to improve ZSD performance [8, 9, 10, 11, 12, 13]. These methods use additional information, such as attributes or semantic word-vectors, as extra knowledge to detect unseen objects; we also use semantic word-vectors in our method.…”
Section: Introduction
Mentioning (confidence: 99%)
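For context on the "semantic word-vectors as extra knowledge" point above, the following is a minimal sketch of the common embedding-based approach: project region features into the word-vector space and score them against class embeddings by cosine similarity. The projection layer, dimensions, and inputs are assumptions for illustration, not a specific paper's method.

```python
# Minimal sketch of embedding-based zero-shot classification of region features.
# All dimensions and inputs are illustrative assumptions.
import torch
import torch.nn.functional as F

FEAT_DIM, SEM_DIM = 1024, 300
projector = torch.nn.Linear(FEAT_DIM, SEM_DIM)     # assumed learned on seen classes

def classify_regions(region_feats, class_word_vectors):
    """Return the most similar class (seen or unseen) for each region feature."""
    projected = F.normalize(projector(region_feats), dim=1)
    prototypes = F.normalize(class_word_vectors, dim=1)
    scores = projected @ prototypes.t()             # cosine similarities
    return scores.argmax(dim=1), scores

# Hypothetical inputs: 5 region proposals, word vectors for 80 seen + 20 unseen classes.
regions = torch.randn(5, FEAT_DIM)
word_vectors = torch.randn(100, SEM_DIM)
pred_classes, scores = classify_regions(regions, word_vectors)
```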