Abstract: Object detection and pose estimation are strict requirements for many robotic grasping and manipulation applications, endowing robots with the ability to grasp objects with different properties in cluttered scenes and under various lighting conditions. This work proposes the framework i2c-net to extract the 6D pose of multiple objects belonging to different categories, starting from an instance-level pose estimation network and relying only on RGB images. The network is trained on a custom-made synthetic photo-r…
“…Indeed, the statistics reported by the BOP benchmark show that scores suffer a substantial drop even at low levels of occlusion, as demonstrated by the roughly 30% performance gap between LINEMOD and Occluded-LINEMOD, which contains the same objects but partially occluded. Estimating the 6D pose of objects is an active field with important practical implications, and after 2018, other works [18], [19] have been published, showing a margin of improvement in several respects. Therefore, the authors believe that in the near future such methods can be employed in the proposed benchmark to automatically detect the occlusion percentage of cluttered scenes in the evaluation metric; in the meantime, manual segmentation guarantees more accurate measurements.…”
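The occlusion percentage mentioned in this excerpt can in principle be derived from segmentation masks. Below is a minimal sketch, assuming per-object binary masks for both the full (amodal) silhouette and the actually visible pixels; the function name and interface are illustrative, not the benchmark's actual API:

```python
import numpy as np

def occlusion_percentage(amodal_mask: np.ndarray, visible_mask: np.ndarray) -> float:
    """Fraction (in %) of an object's projected surface hidden by clutter.

    amodal_mask:  boolean mask of the full object silhouette (as if unoccluded)
    visible_mask: boolean mask of the pixels actually visible in the image
    Both inputs are hypothetical; real pipelines would obtain them from
    rendered ground truth or manual segmentation.
    """
    total = amodal_mask.sum()
    if total == 0:
        return 0.0
    visible = np.logical_and(amodal_mask, visible_mask).sum()
    return 100.0 * (1.0 - visible / total)
```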
Autonomous and reliable robotic grasping is a desirable capability in robotic manipulation and is still an open problem. Standardized benchmarks are important tools for evaluating and comparing robotic grasping and manipulation systems across different research groups and for sharing best practices with the community so that errors can be learned from. An ideal benchmarking protocol should encompass the different aspects underpinning grasp execution, including the mechatronic design of grippers, planning, perception, and control, giving information on each aspect as well as on the overall problem. This article gives an overview of the benchmarks, datasets, and competitions proposed and adopted in recent years and presents a novel benchmark with protocols for different tasks that evaluate both the individual components of the system and the system as a whole, introducing an evaluation metric that allows for a fair comparison in highly cluttered scenes by taking into account the difficulty of the clutter, as sketched below. A website dedicated to the benchmark, containing information on the different tasks, maintaining the leaderboards, and serving as a contact point for the community, is also provided.
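One way such a metric could account for clutter difficulty is to weight each successful pick by a per-attempt difficulty factor. The following sketch is purely illustrative of that idea and is not the benchmark's published formula; the inputs and the weighting scheme are assumptions:

```python
def clutter_weighted_score(picks, difficulty):
    """Hypothetical difficulty-weighted grasping score.

    picks:      list of booleans, one per attempted pick (True = success)
    difficulty: list of per-attempt weights in (0, 1], e.g. derived from
                the occlusion percentage of the target object
    Returns a score in [0, 1]; harder attempts contribute more when solved.
    """
    weighted = sum(w for ok, w in zip(picks, difficulty) if ok)
    total = sum(difficulty)
    return weighted / total if total > 0 else 0.0
```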
“…The boom of deep learning has significantly improved object pose estimation. A series of methods have been proposed to holistically estimate object poses from monocular color images [1], [26], [27], [28] or with the aid of depth sensors [29], [30], [31], [32], [33], [34]. These methods take advantage of CNNs' regression ability to learn mapping functions from the observed images to object poses.…”
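The holistic regression approach described in this excerpt maps an image crop directly to a pose. Below is a minimal PyTorch-style sketch of the idea; the architecture, layer sizes, and names are assumptions for illustration and do not correspond to any of the cited networks:

```python
import torch
import torch.nn as nn

class PoseRegressor(nn.Module):
    """Toy CNN that regresses a 6D pose (3D translation + unit quaternion)
    from an RGB crop of an object. Illustrative only; the cited works use
    far more elaborate backbones and training losses."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(64, 7)  # 3 translation + 4 quaternion values

    def forward(self, x):
        out = self.head(self.backbone(x))
        t, q = out[:, :3], out[:, 3:]
        q = q / q.norm(dim=1, keepdim=True)  # normalize to a unit quaternion
        return t, q
```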
6-DoF object pose estimation from a monocular image is a challenging problem in which a post-refinement procedure is generally needed for high-precision estimation. In this paper, we propose a framework, dubbed RNNPose, based on a recurrent neural network (RNN) for object pose refinement, which is robust to erroneous initial poses and occlusions. During the recurrent iterations, object pose refinement is formulated as a non-linear least squares problem based on the estimated correspondence field (between a rendered image and the observed image). The problem is then solved by a differentiable Levenberg-Marquardt (LM) algorithm, enabling end-to-end training. The correspondence field estimation and pose refinement are conducted alternately in each iteration to improve the object poses. Furthermore, to improve robustness against occlusion, we introduce a consistency-check mechanism based on the learned descriptors of the 3D model and observed 2D images, which downweights the unreliable correspondences during pose optimization. We evaluate RNNPose on several public datasets, including LINEMOD, Occlusion-LINEMOD, YCB-Video and TLESS. We demonstrate state-of-the-art performance and strong robustness against severe clutter and occlusion in the scenes. Extensive experiments validate the effectiveness of our proposed method. Moreover, the extended system based on RNNPose successfully generalizes to multi-instance scenarios and achieves top-tier performance on the TLESS dataset.
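To make the least-squares formulation concrete, here is a minimal sketch of one damped, weighted Gauss-Newton/LM update on a 6-vector pose increment of the kind the abstract describes, with per-correspondence weights playing the role of the consistency check. Variable names, shapes, and the solver interface are assumptions, not RNNPose's actual code:

```python
import torch

def lm_pose_step(residuals, jacobian, weights, lam=1e-3):
    """One damped Levenberg-Marquardt update for a pose increment.

    residuals: (N,) correspondence-field errors at the current pose
    jacobian:  (N, 6) derivative of each residual w.r.t. the pose increment
               (e.g. an se(3) twist)
    weights:   (N,) per-correspondence confidences; unreliable matches get
               small weights, mimicking the paper's consistency check
    All operations are differentiable tensor ops, so the step can sit
    inside an end-to-end training graph.
    """
    W = weights.unsqueeze(1)                     # (N, 1)
    JtWJ = jacobian.t() @ (W * jacobian)         # (6, 6) weighted normal matrix
    JtWr = jacobian.t() @ (weights * residuals)  # (6,) weighted gradient term
    damped = JtWJ + lam * torch.diag(torch.diagonal(JtWJ))  # LM damping
    delta = torch.linalg.solve(damped, -JtWr)    # pose increment
    return delta
```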
“…Therefore, an important goal in this ongoing industrial revolution is to make such algorithms robust to clutter in order to increase the flexibility of the next generation of robots. Estimating the pose of objects is an active field with important practical implications, and in recent years some works (Song et al., 2020; Remus et al., 2023) have been published showing a margin of improvement in several respects.…”
Several datasets have been proposed in the literature, focusing on object detection and pose estimation. The majority of them focus on recognizing isolated objects or estimating the pose of objects in well-organized scenarios. This work introduces a novel dataset that aims to stress vision algorithms in the difficult task of object detection and pose estimation in highly cluttered scenes, for the specific case of bin picking in the Cluttered Environment Picking Benchmark (CEPB). The dataset provides about 1.5M virtually generated photo-realistic images (RGB + depth + normals + segmentation) of 50K annotated cluttered scenes mixing rigid, soft, and deformable objects of varying sizes used in existing robotic picking benchmarks, together with their 3D models (40 objects). The images cover three different camera positions, three lighting conditions, and multiple High Dynamic Range Imaging (HDRI) maps for domain randomization purposes. The annotations contain the 2D and 3D bounding boxes of the involved objects, the centroids’ poses (translation + quaternion), and the visibility percentage of the objects’ surfaces. Nearly 10K images of isolated objects are also provided to enable simple tests and comparison with the more complex cluttered-scene tests. A baseline obtained with the DOPE neural network is reported to highlight the challenges introduced by the novel dataset.
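A per-object annotation record of the kind listed in this abstract might be represented as follows. This is a sketch under stated assumptions; the field names, conventions, and units are illustrative and not the CEPB dataset's actual schema:

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class ObjectAnnotation:
    """Hypothetical per-object record mirroring the annotations described
    in the abstract; not the dataset's real format."""
    object_id: str
    bbox_2d: Tuple[int, int, int, int]              # (x_min, y_min, x_max, y_max), pixels
    bbox_3d: Tuple[float, ...]                      # e.g. 8 corner coordinates, meters
    translation: Tuple[float, float, float]         # centroid position (x, y, z), meters
    quaternion: Tuple[float, float, float, float]   # centroid orientation (w, x, y, z)
    visibility: float                               # visible fraction of the surface, [0, 1]
```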