Proceedings of the 28th ACM International Conference on Multimedia 2020
DOI: 10.1145/3394171.3413575

HOSE-Net: Higher Order Structure Embedded Network for Scene Graph Generation

Abstract: Scene graph generation aims to produce structured representations for images, which requires understanding the relations between objects. Due to the continuous nature of deep neural networks, the prediction of scene graphs is divided into object detection and relation classification. However, the independent relation classes cannot separate the visual features well. Although some methods organize the visual features into graph structures and use message passing to learn contextual information, they still suffe…

Cited by 16 publications (11 citation statements)
References 31 publications
“…This approach controls the weight on different sample types, diminishing the side effect of the negative samples on prediction. Wei et al. [186] proposed HOSE-Net to concentrate on the semantic overlap between the objects, by building two object sets (L_s, L_o) and calculating the similarity between subject and object within a triplet. Wang et al. [195] believed that larger targets represent major relationship positions in a scene, so they built a tree-like structure and a Relation Ranking Module (RRM) to pay more attention to the relationships and objects within the salient region.…”
Section: Other SGG Methods
Type: mentioning
confidence: 99%
“…Scene graph generation not only localizes objects but also recognizes their relationships, which is a visual task with higher semantic abstraction. As a structural representation of images, the objects and their semantic relationships are represented as nodes and edges in the scene graph [5, 7–9]. There is a series of triples, <subject-predicate-object>, in the scene graph: "predicate" represents a specific semantic relationship, and "subject" and "object" are the two instances involved, as shown in Figure 1c.…”
Section: Introduction, 1. Background of Scene Graph Generation
Type: mentioning
confidence: 99%
“…There is a series of triples, <subject-predicate-object>, in the scene graph: "predicate" represents a specific semantic relationship, and "subject" and "object" are the two instances involved, as shown in Figure 1c. As a bridge between low-level recognition and high-level understanding of images, the scene graph supports many downstream vision tasks [8–12], such as image captioning [13,14] and visual question answering [15,16].…”
Section: Introduction, 1. Background of Scene Graph Generation
Type: mentioning
confidence: 99%
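
The triple representation described in the two statements above can be illustrated with a small sketch. This is not code from HOSE-Net or from any citing paper; the names `Triple` and `build_scene_graph` and the example labels are hypothetical, chosen only to show how <subject-predicate-object> triples map onto nodes and edges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One <subject-predicate-object> relation (hypothetical helper type)."""
    subject: str
    predicate: str
    obj: str


def build_scene_graph(triples):
    """Collect object instances as nodes and predicates as directed edges."""
    nodes = set()
    edges = {}  # (subject, object) -> list of predicates
    for t in triples:
        nodes.update((t.subject, t.obj))
        edges.setdefault((t.subject, t.obj), []).append(t.predicate)
    return nodes, edges


# Illustrative labels only; they are not taken from any dataset.
triples = [
    Triple("man", "riding", "horse"),
    Triple("man", "wearing", "hat"),
]
nodes, edges = build_scene_graph(triples)
```

Here each distinct subject or object becomes a node, and each predicate becomes a labeled edge directed from subject to object.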
“…Most of the existing efforts in long-tailed SGG [6,14,37,41,43,45] deal with the skewed class distribution directly. However, [footnote 1: https://github.com/coldmanck/recovering-unbiased-scene-graphs] Traditionally SGG models are not trained in the PU setting and thus output biased probabilities in favor of conspicuous classes (e.g., on).…”
Section: Introduction
Type: mentioning
confidence: 99%
“…To produce meaningful scene graphs, the inconspicuous but informative predicates need to be properly predicted. To the best of our knowledge, none of the existing SGG debiasing methods [6,14,37,41,43,45] effectively solve this reporting bias problem.…”
Section: Introduction
Type: mentioning
confidence: 99%