Generating scene graph to describe the whereabouts and interactions of objects in an image has attracted increasing attention of researchers. Most existing methods explore object-level visual context or bodypart-object cooperation with the message passing structure, which can not meet the part-aware interaction nature of scene graph. Normally, a subject interacts with an object through crucial parts in each other. Besides, the correlation among parts within an identical object can also help predicting objects and their relationships. Hence, both of subject and object parts and their intra-and inter-object correlations should be fully considered for scene graph generation. In this paper, we propose a part-aware interactive learning method, which are divided into the intra-object and inter-object scenarios. First, we detect objects from an image and further decompose each one into a set of parts. Second, the part-aware graph attention module is proposed to refine part features via the intra-object message passing, and the refined features are incorporated for object inference. Third, the visual mutual attention module is designed to discover part-aware correlated visual cues precisely for predicate inference. It can highlight the subjectrelated object parts and the object-related subject parts during inter-object interactive learning. We demonstrate the superiority of our method against the state of the arts on Visual Genome. Ablation studies and visualization further validate its effectiveness. CCS CONCEPTS • Computing methodologies → Scene understanding.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.