Hongshuo Tian scite author profile

Generating scene graphs to describe the whereabouts and interactions of objects in an image has attracted increasing attention from researchers. Most existing methods explore object-level visual context or body-part-object cooperation with a message passing structure, which cannot capture the part-aware nature of interactions in a scene graph. Normally, a subject interacts with an object through crucial parts of each. Besides, the correlation among parts within the same object can also help predict objects and their relationships. Hence, both subject and object parts, together with their intra- and inter-object correlations, should be fully considered for scene graph generation. In this paper, we propose a part-aware interactive learning method, which is divided into intra-object and inter-object scenarios. First, we detect objects in an image and further decompose each one into a set of parts. Second, a part-aware graph attention module is proposed to refine part features via intra-object message passing, and the refined features are incorporated for object inference. Third, a visual mutual attention module is designed to precisely discover part-aware correlated visual cues for predicate inference. It can highlight the subject-related object parts and the object-related subject parts during inter-object interactive learning. We demonstrate the superiority of our method over the state of the art on Visual Genome. Ablation studies and visualization further validate its effectiveness.

CCS Concepts: Computing methodologies → Scene understanding.
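The two attention steps described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract gives no formulas, so the dot-product scoring, the residual update, the mean pooling, and all function names below (`refine_parts`, `mutual_attention`, and the helpers) are assumptions chosen only to make the idea concrete, with part features represented as plain lists of floats.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def weighted_sum(weights, vectors):
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]

def mean_pool(parts):
    n = len(parts)
    return [sum(col) / n for col in zip(*parts)]

def refine_parts(parts):
    """Intra-object step (assumed form): each part attends over all parts
    of the same object and adds the aggregated message to itself."""
    refined = []
    for p in parts:
        weights = softmax([dot(p, q) for q in parts])
        message = weighted_sum(weights, parts)
        refined.append([a + b for a, b in zip(p, message)])  # residual update
    return refined

def mutual_attention(subject_parts, object_parts):
    """Inter-object step (assumed form): weight each side's parts by their
    similarity to a pooled summary of the other side."""
    subj_summary = mean_pool(subject_parts)
    obj_summary = mean_pool(object_parts)
    subj_weights = softmax([dot(p, obj_summary) for p in subject_parts])
    obj_weights = softmax([dot(p, subj_summary) for p in object_parts])
    return subj_weights, obj_weights

# Toy, made-up 2-D part features for one subject and one object.
subject_parts = [[1.0, 0.0], [0.0, 1.0]]
object_parts = [[1.0, 0.0], [1.0, 0.0]]
refined_subject = refine_parts(subject_parts)
subj_w, obj_w = mutual_attention(subject_parts, object_parts)
```

In this toy setup the first subject part receives the larger weight because it aligns with the object's pooled feature, which mirrors the abstract's idea of highlighting the object-related subject parts. A real model would learn the projections and scoring function rather than use raw dot products.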


Counterfactual Visual Dialog: Robust Commonsense Knowledge Learning From Unbiased Training

Liu, Huang, et al. 2024. IEEE Trans. Multimedia


Mask and Predict

Tian, Liu, et al. 2021


Hongshuo Tian

Toward Region-Aware Attention Learning for Scene Graph Generation

Part-Aware Interactive Learning for Scene Graph Generation

Counterfactual Visual Dialog: Robust Commonsense Knowledge Learning From Unbiased Training

Mask and Predict
