Proceedings of the 28th ACM International Conference on Multimedia 2020
DOI: 10.1145/3394171.3413501

Part-Aware Interactive Learning for Scene Graph Generation

Abstract: Generating scene graphs to describe the whereabouts and interactions of objects in an image has attracted increasing attention from researchers. Most existing methods explore object-level visual context or bodypart-object cooperation with a message-passing structure, which cannot meet the part-aware interaction nature of scene graphs. Normally, a subject interacts with an object through crucial parts of each other. Besides, the correlation among parts within an identical object can also help in predicting objects …

Cited by 12 publications (3 citation statements)
References 39 publications (60 reference statements)

“…When modeling the interaction of inter-actor parts, we need to explore the mutually important and relevant features among parts of different individuals. To solve this problem, we learn from the mutual attention mechanism [47, 48], which is originally introduced in social networks. The mutual attention mechanism is also used to calculate the edge weight vector in the visual graph in our work.…”
Section: Methods (mentioning)
confidence: 99%
“…They used the Naive Bayes classifier, K-Nearest Neighbor Algorithm, Back Propagation Neural Network Algorithm, C4.5, and Support Vector Machines as classification techniques. A proposed cost-effective mobile phone-based device solution that is both cost-effective and noise-resistant (Tian et al 2021). They used features such as Local Binary Pattern (LBP), Gabor, and Histogram-based features to differentiate between different types of obstacles.…”
Section: Literature Review (mentioning)
confidence: 99%
“…For understanding beyond standard labels, several papers have investigated the automatic learning and decomposition of tasks from instructional videos [1,39,53]. Other work seeks to uncover the scene graph and the interactions of a video, where nodes denote objects and edges denote activities [26,31,36,58,63,66,68]. We also aim to learn from long videos beyond standard labels.…”
Section: Multimodal Video Understanding (mentioning)
confidence: 99%