SC6D: Symmetry-agnostic and Correspondence-free 6D Object Pose Estimation

Cai, Dingding; Heikkilä, Janne; Rahtu, Esa

doi:10.1109/3dv57658.2022.00065

Cited by 7 publications

(4 citation statements)

References 57 publications

(131 reference statements)


“…Object-Specific Pose Estimation. Most existing pose estimation methods [2,3,5,12,16,18,24,37,44,49,52] are object-specific pose estimators, which are specialized for pre-defined objects and cannot generalize to previously unseen objects without retraining. Some of them [2,3,5,18,49,52] directly regress the 6D pose parameters from RGB images by training deep neural networks on a large number of labeled images.…”

Section: Related Work

mentioning

confidence: 99%

“…Most existing pose estimation methods [2,3,5,12,16,18,24,37,44,49,52] are object-specific pose estimators, which are specialized for pre-defined objects and cannot generalize to previously unseen objects without retraining. Some of them [2,3,5,18,49,52] directly regress the 6D pose parameters from RGB images by training deep neural networks on a large number of labeled images. While other approaches [5,12,16,24,36,37,44] establish 2D-3D correspondences between 2D images and 3D object models to estimate the 6D pose by solving the Perspective-n-Point (PnP) [21] problem.…”

Section: Related Work

mentioning

confidence: 99%

“…These keyframes are processed through DINOv2 and Co-Segmenter to jointly predict object segmentation masks. These predicted masks are then utilized to extract the object semantic tokens ($F_{obj}$) from the keyframe feature tokens (2)…”

mentioning

confidence: 99%

“…We perform image segmentation for all reference images $\{I^{ref}_i\}_{i=1}^{N_r}$ using an Obj-Segmenter with the obtained semantic information $F_{obj}$. Subsequently, we employ an RA-Encoder to extract the rotation-aware embeddings $\{V^{obj}_i\}_{i=1}^{N_r}$ from the segmented images (3)…”

mentioning

confidence: 99%


MSDA: Monocular Self-supervised Domain Adaptation for 6D Object Pose Estimation

Cai; Heikkilä; Rahtu

2023

Lecture Notes in Computer Science

Self Cite


This paper introduces GS-Pose, an end-to-end framework for locating and estimating the 6D pose of objects. GS-Pose begins with a set of posed RGB images of a previously unseen object and builds three distinct representations stored in a database. At inference, GS-Pose operates sequentially by locating the object in the input image, estimating its initial 6D pose using a retrieval approach, and refining the pose with a render-and-compare method. The key insight is the application of the appropriate object representation at each stage of the process. In particular, for the refinement step, we utilize 3D Gaussian splatting, a novel differentiable rendering technique that offers high rendering speed and relatively low optimization time. Off-the-shelf toolchains and commodity hardware, such as mobile phones, can be used to capture new objects to be added to the database. Extensive evaluations on the LINEMOD and OnePose-LowTexture datasets demonstrate excellent performance, establishing the new state-of-the-art. Project page: https://dingdingcai.github.io/gs-pose
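The retrieval step mentioned in the abstract (estimating an initial pose by matching against stored reference views) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `cosine` and `retrieve_initial_pose`, the three-dimensional toy embeddings, and the string pose labels are all hypothetical, and cosine similarity is an assumed matching criterion. The sketch also assumes every embedding is non-zero.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve_initial_pose(query, references):
    """Return the pose attached to the reference embedding most similar to the query.

    `references` is a list of (embedding, pose) pairs; this layout is a
    hypothetical stand-in for a database built from posed reference images.
    """
    _, best_pose = max(references, key=lambda ref: cosine(query, ref[0]))
    return best_pose

# Toy database: one embedding per reference view, each with a placeholder pose label.
references = [
    ([1.0, 0.0, 0.0], "pose_A"),
    ([0.0, 1.0, 0.0], "pose_B"),
    ([0.0, 0.0, 1.0], "pose_C"),
]

# The query embedding is closest in direction to the second reference view.
initial_pose = retrieve_initial_pose([0.1, 0.9, 0.2], references)
```

In a pipeline of the kind the abstract describes, a pose retrieved this way would serve only as a starting point for the subsequent render-and-compare refinement.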
