2022
DOI: 10.1007/978-3-031-20077-9_24

DISP6D: Disentangled Implicit Shape and Pose Learning for Scalable 6D Pose Estimation

Cited by 11 publications (9 citation statements)
References 44 publications
“…Novel views. Instead of explicitly regressing the shape, Chen et al. [21] and DISP6D [41] employ neural networks that generate RGB and optionally depth views given a latent shape representation and viewing direction. In contrast to neural field representations (see Continuous SDFs below), multi-view consistency is not enforced by these methods; that is, there is no guarantee that the resulting views are consistent with each other.…”
Section: Shape Representation (mentioning; confidence: 99%)
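The view-generation idea quoted above can be sketched as a decoder that maps a latent shape code plus a viewing direction to a single image. This is a minimal illustration, not the papers' actual architecture; the weights, dimensions, and function name are all hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical decoder weights: 6-d shape latent + 3-d view direction -> 4x4 RGB.
W1 = rng.normal(size=(9, 32))
W2 = rng.normal(size=(32, 4 * 4 * 3))

def render_view(z_shape, view_dir):
    """Generate one RGB view from a latent shape code and a viewing direction.
    Each view is produced independently, so nothing in the model enforces
    multi-view consistency -- the limitation the citation points out."""
    x = np.concatenate([z_shape, view_dir])
    h = np.tanh(x @ W1)
    rgb = 1.0 / (1.0 + np.exp(-(h @ W2)))  # sigmoid: pixel values in (0, 1)
    return rgb.reshape(4, 4, 3)

view = render_view(np.zeros(6), np.array([0.0, 0.0, 1.0]))
```

Because each call is an independent forward pass, two nearby viewing directions can yield views that disagree about the object's geometry, unlike a shared neural field.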
“…At inference time, the orientation can be inferred by finding the closest stored latent representation (and its associated orientation) to the encoded input. This approach automatically learns any form of ambiguity and has been adapted to categorical pose estimation by iCaps [22] and DISP6D [41]. iCaps uses a categorical reference object when training the decoder (i.e., implying category-level ambiguities), whereas DISP6D modifies the codebook at inference time, which in principle supports inference of object-level ambiguities.…”
(mentioning; confidence: 99%)
“…Category-level pose estimation aims at generalizing to unknown objects of the same category [32], [16], [61], [69], [24], [106], in contrast to instance-level object pose estimation, which learns to predict poses of known objects.…”
Section: G. Category-level Training (mentioning; confidence: 99%)
“…In contrast to instance-level object pose estimation, which learns to predict poses of known objects, common principles are encoding canonical or category-specific features [61], [16], [24], [106], render-and-compare [69], and fast adaptation [32].…”
Section: G. Category-level Training (mentioning; confidence: 99%)
“…For example, ShAPO [13] jointly predicts object shape, pose, and size in a single-shot manner. DISP6D [34] disentangles the latent representation of shape and pose into two sub-spaces, improving scalability and generality. Neural Radiance Fields (NeRF) [24] provide a mechanism for capturing complex 3D structures from only one or a few RGB images, which is also applicable to object pose estimation.…”
Section: Implicit Field for Pose Estimation (mentioning; confidence: 99%)
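The disentanglement described above can be illustrated as splitting a joint latent code into independent shape and pose sub-vectors that recombine freely. The subspace sizes and helper names below are hypothetical, not taken from DISP6D:

```python
import numpy as np

D_SHAPE, D_POSE = 4, 2  # hypothetical subspace dimensions

def split_latent(z):
    """Split a joint latent code into shape and pose sub-vectors,
    mirroring the two-subspace factorization described above."""
    return z[:D_SHAPE], z[D_SHAPE:D_SHAPE + D_POSE]

def recombine(shape_code, pose_code):
    """Pair any shape code with any pose code -- the property that lets the
    representation scale across objects and orientations independently."""
    return np.concatenate([shape_code, pose_code])

z_a = np.arange(6.0)    # latent of object A (toy values)
z_b = -np.arange(6.0)   # latent of object B (toy values)
shape_a, _ = split_latent(z_a)
_, pose_b = split_latent(z_b)
z_mix = recombine(shape_a, pose_b)  # shape of A combined with pose of B
```

Because the two factors never share coordinates, swapping either one leaves the other untouched, which is what makes the factorized codebook scale to many objects.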