“…To recover object shapes, some methods [Groueix et al 2018; Wang et al 2018] reconstruct meshes from a template, while others [Huang et al 2018b; Izadinia et al 2017] adopt shape retrieval approaches that search a given CAD database. Recently, some approaches [Dahnert et al 2021; Nie et al 2020; Popov et al 2020; Yang and Zhang 2016; Zhang et al 2021b] enable 3D scene understanding by generating a room layout, camera pose, object bounding boxes, or even meshes from a single view, automatically completing and annotating scene meshes [Bokhovkin et al 2021], or predicting object alignments and layouts [Avetisyan et al 2020] from an RGB-D scan. Inspired by the observation of PanoContext [Zhang et al 2014] that panoramic images contain richer context information than perspective ones, Zhang et al [2021a] propose an improved 3D scene understanding method that takes panoramic captures as input.…”
“…In comparison, our focus is on learning part-based semantic and instance segmentation of noisy and fragmented real-world 3D scans. Very recently, initial approaches to semantic 3D segmentation have been proposed (Bokhovkin et al, 2021; Uy et al, 2019), but for a significantly less extensive part hierarchy. More specifically, (Bokhovkin et al, 2021) predicts the part hierarchy only at the object and coarse-part levels, discarding smaller parts altogether; in contrast, we are able to predict parts at finer levels of the hierarchy.…”
We propose Scan2Part, a method to segment individual parts of objects in real-world, noisy indoor RGB-D scans. To this end, we vary the part hierarchies of objects in indoor scenes and explore their effect on scene understanding models. Specifically, we use a sparse U-Net-based architecture that captures the fine-scale detail of the underlying 3D scan geometry by leveraging a multi-scale feature hierarchy. To train our method, we introduce the Scan2Part dataset, the first large-scale collection providing detailed semantic labels at the part level in a real-world setting. In total, we provide 242,081 correspondences between 53,618 PartNet parts of 2,477 ShapeNet objects and 1,506 ScanNet scenes, at two spatial resolutions of 2 cm³ and 5 cm³. As output, we are able to predict fine-grained per-object part labels, even when the geometry is coarse or partially missing. Overall, we believe that both our method and the newly introduced dataset are a stepping stone toward structural understanding of real-world 3D environments.
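The multi-scale feature hierarchy over a voxel grid that the abstract mentions can be illustrated with a minimal NumPy sketch. This is a hypothetical toy example, not the authors' sparse U-Net implementation: it builds the coarser levels of a voxel occupancy pyramid by factor-2 max pooling, as a U-Net encoder over a dense grid would.

```python
import numpy as np

def downsample_occupancy(grid, factor=2):
    """Max-pool a 3D occupancy grid by `factor` along each axis,
    mimicking one encoder level of a voxel U-Net."""
    d, h, w = grid.shape
    # Crop so each dimension divides evenly, then block-reduce with max.
    g = grid[: d - d % factor, : h - h % factor, : w - w % factor]
    g = g.reshape(d // factor, factor, h // factor, factor, w // factor, factor)
    return g.max(axis=(1, 3, 5))

# Toy 8^3 scan (e.g. at 2 cm voxel size): occupy one corner block.
fine = np.zeros((8, 8, 8), dtype=np.uint8)
fine[:2, :2, :2] = 1

# Three-level hierarchy (8^3 -> 4^3 -> 2^3), as in a U-Net encoder path.
pyramid = [fine]
for _ in range(2):
    pyramid.append(downsample_occupancy(pyramid[-1]))

print([p.shape for p in pyramid])  # [(8, 8, 8), (4, 4, 4), (2, 2, 2)]
```

In the actual method the grids are sparse and carry learned features rather than binary occupancy, but the pyramid structure is the same: coarser levels summarize larger spatial context, and the decoder fuses them back with the fine levels.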
“…In the domain of object modeling, part-based approaches leverage computer vision techniques to track movements among object parts [22,23], exploit contextual relations from large datasets [24–26], or develop data-efficient learning methods [27,28]. These approaches aim to recognize and segment object parts, enhancing the understanding of complex object structures, but they do not yield a holistic representation of a scene encompassing multiple objects.…”
Existing methods for reconstructing interactive scenes primarily focus on replacing reconstructed objects with CAD models retrieved from a limited database, resulting in significant discrepancies between the reconstructed and observed scenes. To address this issue, our work introduces a part-level reconstruction approach that reassembles objects from primitive shapes. This enables us to precisely replicate the observed physical scenes and simulate robot interactions with both rigid and articulated objects. By segmenting reconstructed objects into semantic parts and aligning primitive shapes to these parts, we assemble them as CAD models while estimating kinematic relations, including parent-child contact relations, joint types, and parameters. Specifically, we derive the optimal primitive alignment by solving a series of optimization problems, and we estimate kinematic relations based on part semantics and geometry. Our experiments demonstrate that part-level scene reconstruction outperforms object-level reconstruction by accurately capturing finer details and improving precision. These reconstructed part-level interactive scenes provide valuable kinematic information for various robotic applications; we showcase the feasibility of certifying mobile manipulation planning in these interactive scenes before executing tasks in the physical world.
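The primitive-alignment step described above can be sketched in miniature. The following is a hypothetical illustration, not the paper's optimization: it fits an oriented bounding box (one common primitive) to a segmented part's point cloud via PCA, recovering a center, an orientation, and per-axis extents. The slab dimensions and the `fit_oriented_box` helper are assumptions for this example.

```python
import numpy as np

def fit_oriented_box(points):
    """Fit an oriented bounding box to a part's point cloud via PCA:
    a closed-form stand-in for a primitive-alignment optimization."""
    center = points.mean(axis=0)
    centered = points - center
    # Principal axes come from the right singular vectors of the
    # centered points (eigenvectors of the covariance matrix).
    _, _, axes = np.linalg.svd(centered, full_matrices=False)
    # Express the points in the box's local frame and measure extents.
    local = centered @ axes.T
    extents = local.max(axis=0) - local.min(axis=0)
    return center, axes, extents

rng = np.random.default_rng(0)
# Hypothetical "part": points sampled from a 0.6 x 0.4 x 0.05 m slab,
# roughly the shape of a tabletop segment.
pts = rng.uniform(-0.5, 0.5, size=(2000, 3)) * np.array([0.6, 0.4, 0.05])
center, axes, extents = fit_oriented_box(pts)
print(np.round(extents, 2))  # extents sorted by variance; close to the slab dimensions
```

A full pipeline like the one described would additionally choose among primitive types (boxes, cylinders, etc.), refine the fit under contact constraints, and then infer joints between adjacent primitives from part semantics and geometry.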