2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019
DOI: 10.1109/cvpr.2019.01121

Unsupervised Part-Based Disentangling of Object Shape and Appearance

Abstract: Large intra-class variation is the result of changes in multiple object characteristics. Images, however, only show the superposition of different variable factors such as appearance or shape. Therefore, learning to disentangle and represent these different characteristics poses a great challenge, especially in the unsupervised case. Moreover, large object articulation calls for a flexible part-based model. We present an unsupervised approach for disentangling appearance and shape by learning parts consistentl…

Citations: cited by 156 publications (193 citation statements)
References: 49 publications (100 reference statements)
“…Jakab & Gupta et al [25], the most related, is described in the introduction. Lorenz et al [33], Zhang et al [74] develop an auto-encoding formulation to discover landmarks as explicit structural representations for a given…”
Section: Related Work (mentioning)
confidence: 99%
“…Our method allows for similar but more fine-grained conditional image generation, conditioned on an appearance image or object landmarks. Many unsupervised methods for pose estimation [25,33,50,67,74] share similar ability. However, we can achieve more accurate and predictable image editing by manipulating semantic parts in the image through their corresponding landmarks.…”
Section: Conditional Image Decoder (mentioning)
confidence: 99%
“…Pose guided image and video generation Given a source image and a target 2D pose, image-based methods [43,44,22,59,49] produce an image with the source appearance in the target pose. To deal with pixel miss-alignments, it is helpful to transform the pixels in the original image to match the target pose within the network [60,11,42]. Following similar ideas to growing GANs [36], high resolution anime characters can be generated [29].…”
Section: Related Work (mentioning)
confidence: 99%
“…[17] proposes a cycle-consistent VAE which adds a cyclic loss to the VAE objective. [40] directly models π as keypoints. All of these methods rely on the same basic principle for disentanglement: Constraining the amount of information in π.…”
Section: Disentangling Without Pose-annotations (mentioning)
confidence: 99%