2019
DOI: 10.48550/arxiv.1906.05105
Preprint

Pose from Shape: Deep Pose Estimation for Arbitrary 3D Objects

Abstract: Most deep pose estimation methods need to be trained for specific object instances or categories. In this work we propose a completely generic deep pose estimation approach, which does not require the network to have been trained on relevant categories, nor objects in a category to have a canonical pose. We believe this is a crucial step to design robotic systems that can interact with new objects "in the wild" not belonging to a predefined category. Our main insight is to dynamically condition pose estimation…

Citations: cited by 11 publications (13 citation statements)
References: 51 publications
“…Although deep learning methods for 6D pose estimation have achieved very accurate results, most such methods are trained for particular objects and do not generalize to unseen objects without retraining, which can take tens of GPU-days [20], [2], [21], [4]. Several recent works tackled the zero-shot pose estimation problem by learning a latent object representation [6], [7], [8]. However, recent analysis has revealed that such methods perform poorly in cluttered scenes, even when a ground-truth bounding box is provided as input [9].…”
Section: Related Work, A. Zero-shot Pose Estimation (mentioning)
confidence: 99%
“…To address this issue, a number of zero-shot pose estimators have been developed. However, most zero-shot pose estimators only evaluate on sparse, uncluttered scenes where the object of interest is detected and cropped or is sitting on an empty table [6], [7], [8]. Evaluation of such methods in cluttered settings shows that such methods fail to provide reasonable performance, even with the addition of groundtruth bounding boxes or ground truth translation as input (see analysis in Okorn, et al [9] Appendix B).…”
Section: Introduction (mentioning)
confidence: 99%
“…Approaches to 3D object-based mapping can be broadly classified into two categories: learning-based and geometry-based. The first category mostly extends existing 2D detectors to also output 3D bounding box from single images [23,26,28,33,54,58]. If a video sequence is available, the single-view 3D estimations can be fused using a filter or a LSTM to create a consistent mapping of the scene [5,20,24].…”
Section: Related Work (mentioning)
confidence: 99%
“…Rendering multiple simulated views from the 3D model of the target object can be used to leverage deep transfer learning for pose estimation [21]. Garon et.…”
Section: Related Work (mentioning)
confidence: 99%