Visual Graphs from Motion (VGfM): Scene Understanding with Object Geometry Reasoning

James, Stuart; Bue, Alessio Del

doi:10.1007/978-3-030-20893-6_21

Cited by 20 publications

(24 citation statements)

References 38 publications

“…Only a few works have explored scene graphs in 3D. Gay et al (2018) propose a 2.5D graph dataset based on ScanNet (Dai et al 2017), Armeni et al (2019), on the other hand, suggest hierarchical 3D scene graphs. They split the different components of a scene into 4 different layers: cameras, objects, buildings and rooms.…”

Section: D Object Context and Scene Layout mentioning

confidence: 99%

“…Additionally to the data, Armeni et al (2019) and Gay et al (2018) propose graph prediction methods. Armeni et al (2019) sample images from a panoramic camera and apply a regularization technique to 2D mask predictions aiming to obtain improved 3D object nodes.…”

Section: D Object Context and Scene Layout mentioning

confidence: 99%

“…Although, 3D graphs have been used in computer graphics for decades to store 3D mesh data, the respective edges usually do not represent semantic connections but rather relative transformations such that when a parent node is relocated, the change is applied in a hierarchical fashion to all child nodes. Only recently, semantic scene graphs have started to emerge in the 3D context (Gay et al 2018; Armeni et al 2019; Rosinol et al 2020b). Armeni et al construct graphs for buildings, including rooms, major objects, camera views and the relations between these entities (Armeni et al 2019).…”

Section: Introduction mentioning

confidence: 99%

“…Rosinol et al (2020b) incorporates dynamics to this representation by additionally considering moving humans. Both (Armeni et al 2019) and (Gay et al 2018) propose multiview graph prediction methods based on 2D masks (Armeni et al 2019) and object detection networks (Gay et al 2018). They estimate graphs from images while we operate on 3D data directly.…”

Section: Introduction mentioning

confidence: 99%

Learning 3D Semantic Scene Graphs with Instance Embeddings

2022

A 3D scene is more than the geometry and classes of the objects it comprises. An essential aspect beyond object-level perception is the scene context, described as a dense semantic network of interconnected nodes. Scene graphs have become a common representation to encode the semantic richness of images, where nodes in the graph are object entities connected by edges, so-called relationships. Such graphs have been shown to be useful in achieving state-of-the-art performance in image captioning, visual question answering and image generation or editing. While scene graph prediction methods so far focused on images, we propose instead a novel neural network architecture for 3D data, where the aim is to learn to regress semantic graphs from a given 3D scene. With this work, we go beyond object-level perception, by exploring relations between object entities. Our method learns instance embeddings alongside a scene segmentation and is able to predict semantics for object nodes and edges. We leverage 3DSSG, a large scale dataset based on 3RScan that features scene graphs of changing 3D scenes. Finally, we show the effectiveness of graphs as an intermediate representation on a retrieval task.

“…Additionally, we integrate the results of classifiers by using rule-based architecture for estimating multi-modal information. This rule-based integration architecture is inspired by the methods which utilize structured knowledge to learn from small amounts of experience for scene understanding [7]. Fig.…”

Section: Introduction mentioning

confidence: 99%

A System of Associated Intelligent Integration for Human State Estimation

Matsufuji, Hsieh, Sato-Shimokawara et al. 2019

JMEA

We propose a learning architecture for integrating multi-modal information, e.g., visual and audio information. In recent years, artificial intelligence (AI) has made major progress in key tasks such as language, vision, and voice recognition. Most studies focus on how AI could achieve human-like abilities. In the human-robot interaction research field in particular, some researchers attempt to make robots talk with humans in daily life. The key challenges in making robots talk naturally in conversation are considering multi-modal non-verbal information as humans do, and learning from a small amount of labeled multi-modal data. Previous multi-modal learning needs a large amount of labeled data, while labeled multi-modal data are scarce and difficult to collect. In this research, we address these challenges by integrating single-modal classifiers, each trained on its own modality. Our architecture utilizes knowledge by using bi-directional associative memory. Furthermore, we conducted a conversation experiment to collect multi-modal non-verbal information. We verify our approach by comparing the accuracy of our system with that of a conventional system trained on multi-modal information.

A Survey on 3D Scene Graphs: Definition, Generation and Application

Bae, Shin et al. 2023

Lecture Notes in Networks and Systems
