2021
DOI: 10.48550/arxiv.2107.02112
Preprint

Recovering the Unbiased Scene Graphs from the Biased Ones

Abstract: Given input images, scene graph generation (SGG) aims to produce comprehensive, graphical representations describing visual relationships among salient objects. Recently, more effort has been devoted to the long-tail problem in SGG; however, the imbalance in the fraction of missing labels across classes, or reporting bias, which exacerbates the long tail, is rarely considered and cannot be solved by existing debiasing methods. In this paper we show that, due to the missing labels, SGG can be viewed as a "Le…

Cited by 1 publication (5 citation statements)
References 55 publications

“…On dense relationship proposals, some works propose a contextual modeling structure [11, 19-21, 23, 27, 33, 38, 39, 41, 42, 45, 50-55]. Recent studies have focused on developing logit adjustment and training strategies to address the SGG task's long-tail recognition [1, 4, 5, 7, 14, 15, 19, 29, 32, 38, 43, 44, 48]. The two-stage design is capable of dealing with the complicated scenarios encountered in SGG.…”
Section: Related Work (mentioning)
confidence: 99%
“…The two components of the total cost correspond to the cost of the predicate, subject, and object entity, respectively. The matching index $I^{tri}$ between the triplet prediction and the ground truth is produced by: $I^{tri} = \arg\min_{T, T^{gt}} C$, which is used for the following loss calculation of the predicate node generator. The two terms of $L^{pre}$, that is, $L^{pre}_i$, $L^{pre}_p$ are used to supervise two types of sub-decoder in predicate node generator.…”
Section: Learning and Inference (mentioning)
confidence: 99%
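
The citation statement above describes a matching index obtained as an argmin of a total cost C over pairings of predicted and ground-truth triplets. The quoted text does not say how that argmin is computed, so the following is only a minimal sketch under stated assumptions: the cost matrix is taken as given, the function name `match_min_cost` is hypothetical, and the search is a brute-force enumeration chosen for clarity (a practical system would more likely use the Hungarian algorithm).

```python
from itertools import permutations


def match_min_cost(cost):
    """Brute-force minimum-cost matching (illustrative, not the cited method).

    cost[i][j] is an assumed, precomputed cost of pairing prediction i with
    ground-truth triplet j. Requires len(cost) >= len(cost[0]).
    Returns (assignment, total), where assignment[j] is the index of the
    prediction matched to ground-truth triplet j.
    """
    n_pred, n_gt = len(cost), len(cost[0])
    best_total, best_assign = float("inf"), None
    # Enumerate every way to give each ground-truth triplet a distinct prediction.
    for preds in permutations(range(n_pred), n_gt):
        total = sum(cost[p][g] for g, p in enumerate(preds))
        if total < best_total:
            best_total, best_assign = total, preds
    return list(best_assign), best_total


# Hypothetical costs: 3 predictions (rows) against 2 ground-truth triplets (columns).
cost = [
    [0.9, 0.2],
    [0.1, 0.8],
    [0.5, 0.5],
]
assignment, total = match_min_cost(cost)
print(assignment, round(total, 3))
```

Enumeration grows factorially with the number of triplets, so this form is only suitable for illustrating what the argmin selects on a handful of candidates.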