2017 IEEE International Conference on Robotics and Automation (ICRA)
DOI: 10.1109/icra.2017.7989233
6-DoF object pose from semantic keypoints

Abstract: This paper presents a novel approach to estimating the continuous six degree of freedom (6-DoF) pose (3D translation and rotation) of an object from a single RGB image. The approach combines semantic keypoints predicted by a convolutional network (convnet) with a deformable shape model. Unlike prior work, we are agnostic to whether the object is textured or textureless, as the convnet learns the optimal representation from the available training image data. Furthermore, the approach can be applied to …
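As a rough illustration of the 2D-3D keypoint step described in the abstract, the sketch below fits a rigid 6-DoF pose to convnet-predicted keypoints with a standard PnP solver (OpenCV). It is only a minimal sketch under assumed inputs: the 2D keypoint locations, 3D model points, and camera intrinsics are placeholders, and the paper's actual fitting stage uses a deformable shape model rather than the fixed rigid model assumed here.

```python
import numpy as np
import cv2

# Placeholder 2D keypoint locations predicted by a convnet (pixels).
keypoints_2d = np.array([
    [320.0, 240.0],
    [350.0, 260.0],
    [300.0, 280.0],
    [330.0, 300.0],
    [310.0, 220.0],
    [360.0, 230.0],
], dtype=np.float64)

# Corresponding 3D keypoints on a rigid object model (meters), also placeholders.
keypoints_3d = np.array([
    [ 0.00,  0.00,  0.00],
    [ 0.10,  0.05,  0.00],
    [-0.05,  0.10,  0.02],
    [ 0.05,  0.15,  0.01],
    [-0.02, -0.05,  0.03],
    [ 0.12, -0.03,  0.02],
], dtype=np.float64)

# Pinhole camera intrinsics (assumed known) and no lens distortion.
K = np.array([[600.0, 0.0, 320.0],
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]])
dist_coeffs = np.zeros(5)

# Solve for the rotation and translation that project the 3D keypoints
# onto their predicted 2D image locations.
ok, rvec, tvec = cv2.solvePnP(keypoints_3d, keypoints_2d, K, dist_coeffs,
                              flags=cv2.SOLVEPNP_ITERATIVE)
R, _ = cv2.Rodrigues(rvec)  # convert axis-angle to a 3x3 rotation matrix
print("R =\n", R)
print("t =", tvec.ravel())
```

In the paper itself, the fitting stage also optimizes over the coefficients of a deformable shape basis jointly with the pose, so the rigid PnP call above should be read only as the simplest instance of the keypoint-to-pose idea.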

Cited by 343 publications (269 citation statements)
References 39 publications (54 reference statements)
“…While the keypoint matching problem can be solved using machine learning, deep CNN-based feature learning methods typically fix the 2D-3D keypoint associations and learn to predict the image locations of each corresponding 3D keypoint such as [26,25,35]. They mainly differ in model architecture and the choice of keypoints.…”
Section: Monocular Pose Estimation (mentioning)
confidence: 99%
“…They mainly differ in model architecture and the choice of keypoints. For instance, [25] uses semantic keypoints while [35] chooses the vertices of the 3D bounding box of an object. In our spaceborne scenario, objects are typically not occluded and have relatively rich texture.…”
Section: Monocular Pose Estimation (mentioning)
confidence: 99%
“…The prediction is further refined with independently computed viewpoints. The human pose estimation by [16] has been modified by [19,36] to detect 3D keypoints of multiple rigid classes and consequently estimate the translation and rotation of the object by fitting the keypoints to a shape model.…”
Section: Keypoint Estimation (mentioning)
confidence: 99%
“…Due to space constraints, we concentrate our review on CNN-based methods, which can be grouped into two categories. Methods in the first category, such as [21] and [13], predict 2D keypoints from an image and then use 3D object models to predict the 3D pose given these keypoints. Methods in the second category, such as Viewpoints and Keypoints (V&K) [20] and Render-for-CNN [17], which are closer to what we do, predict 3D pose directly given an image.…”
Section: Introduction (mentioning)
confidence: 99%
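For contrast with the keypoint-based route, the second category mentioned in the last excerpt predicts pose directly from the image. The sketch below shows one minimal form such a direct-prediction head could take; the architecture, layer sizes, and quaternion parameterization are assumptions for illustration and do not reproduce the exact formulation of the methods cited above (which, for example, cast viewpoint estimation as classification).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DirectPoseHead(nn.Module):
    """Toy CNN mapping an RGB image directly to a 6-DoF pose
    (unit quaternion + translation). Sizes are illustrative only."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),          # global feature vector
        )
        self.fc_rot = nn.Linear(32, 4)        # quaternion output
        self.fc_trans = nn.Linear(32, 3)      # translation output

    def forward(self, img):
        feat = self.backbone(img).flatten(1)
        quat = F.normalize(self.fc_rot(feat), dim=1)  # enforce unit norm
        trans = self.fc_trans(feat)
        return quat, trans

# Usage on a dummy batch of images.
model = DirectPoseHead()
quat, trans = model(torch.randn(2, 3, 128, 128))
print(quat.shape, trans.shape)  # (2, 4), (2, 3)
```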