PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization

Saito, Shunsuke; Huang, Zeng; Natsume, Ryota; Morishima, Shigeo; Li, Hao; Kanazawa, Angjoo

doi:10.1109/iccv.2019.00239

Cited by 1,073 publications

(1,052 citation statements)

References 98 publications

Citation types: Supporting, Mentioning (990), Contrasting, Unclassified


“…skirts or dresses. Implicit function based representations [71,56,46,48,31] might be beneficial to deal with different topologies, but they do not allow control. Although it is remarkable that our model can predict the occluded appearance of the person, the model struggles to predict high frequency detail and complex texture patterns.…”

Section: Discussion (mentioning)

confidence: 99%

“…Hence, in this work, we learn from complete texture maps obtained from 3D registrations. 3D person reconstruction from images While promising, recent methods for 3D person reconstruction either require video as input [6,7,8], scans [74], do not allow control over pose, shape and clothing [48,56], focus only on faces [72,32,63,57,47,62], or only on garments [68].…”

Section: Related Work (mentioning)

confidence: 99%


360-Degree Textures of People in Clothing from a Single Image

Lazova, Insafutdinov, Pons-Moll

2019

2019 International Conference on 3D Vision (3DV)

138

109


Figure 1 (panel labels: DensePose, Garment segmentation, Partial texture, Completed texture, Partial segmentation, Completed segmentation, Displacement maps, Input view, Fully-textured 3D avatar): Given a single view of a person we predict a complete texture map in the UV space, complete clothing segmentation as well as a displacement map for the SMPL model [41], which we then combine to obtain a fully-textured 3D avatar.

Abstract: In this paper we predict a full 3D avatar of a person from a single image. We infer texture and geometry in the UV-space of the SMPL model using an image-to-image translation method. Given partial texture and segmentation layout maps derived from the input view, our model predicts the complete segmentation map, the complete texture map, and a displacement map. The predicted maps can be applied to the SMPL model in order to naturally generalize to novel poses, shapes, and even new clothing. In order to learn our model in a common UV-space, we non-rigidly register the SMPL model to thousands of 3D scans, effectively encoding textures and geometries as images in correspondence. This turns a difficult 3D inference task into a simpler image-to-image translation one. Results on rendered scans of people and images from the DeepFashion dataset demonstrate that our method can reconstruct plausible 3D avatars from a single image. We further use our model to digitally change pose, shape, swap garments between people and edit clothing. To encourage research in this direction we will make the source code available for research purpose [5].


“…Varol et al. [35] proposed a synthetic human dataset for monocular model based human segmentation and depth estimation. However, synthetic data trained models suffer from limitations on real world images in high-frequency depth estimation of the human body [29]. [13] introduced another synthetic human dataset to train multi-view surface estimation network.…”

Section: Related Work (mentioning)

confidence: 99%

Learning Dense Wide Baseline Stereo Matching for People

Caliskan, Mustafa, Imre, et al.

2019

2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW)


Existing methods for stereo work on narrow baseline image pairs giving limited performance between wide baseline views. This paper proposes a framework to learn and estimate dense stereo for people from wide baseline image pairs. A synthetic people stereo patch dataset (S2P2) is introduced to learn wide baseline dense stereo matching for people. The proposed framework not only learns human specific features from synthetic data but also exploits pooling layer and data augmentation to adapt to real data. The network learns from the human specific stereo patches from the proposed dataset for wide-baseline stereo estimation. In addition to patch match learning, a stereo constraint is introduced in the framework to solve wide baseline stereo reconstruction of humans. Quantitative and qualitative performance evaluation against state-of-the-art methods of proposed method demonstrates improved wide baseline stereo reconstruction on challenging datasets. We show that it is possible to learn stereo matching from synthetic people dataset and improve performance on real datasets for stereo reconstruction of people from narrow and wide baseline stereo data.


“…Existing state-of-the-art virtual try-on systems require a depth camera for tracking and overlay the human body with the t garment. Saito et al. [27] introduced a novel pixel-aligned implicit function, which spatially aligns the pixel-level information of the input image with the shape of the 3D object, for deep learning-based 3D shape and texture inference of clothed humans from a single input image. But this method should train an encoder to learn individual feature vectors for each pixel of an image, which is time-consuming.…”

Section: Image-based (mentioning)

confidence: 99%

Generation of Realistic Virtual Garments on Recovery Human Model

Zhu, Peng

2019

Mathematical Problems in Engineering


Displaying a variety of fabrics on a customized character could help customers choose which fabric is more suitable for themselves and help customers choose clothing. However, it is not an easy task to show realistic garment on customized virtual character. As a result, we propose a stable finite element method (FEM) model which is stable to approximate stretching behaviors. At first, we measure four kinds of cloth materials with measurement techniques to research elastic deformations in real cloth samples. Then, we use the parameter optimization method by fitting the model with measurement data. For promoting the display of realistic fabrics, we recover 3D human in shape and pose from a single image automatically. Human body datasets are constructed at first. Then, CNN-based image retrieval in shape and skeleton-based template matching method in pose are combined for 3D human model recovery. To enrich human body details, we synthesize the human body and 3D face with spatial transformation. We compared our proposed method of recovering 3D human from a single image with the state-of-the-art methods, and the experimental results show that the proposed method allows the recovered virtual human to put on garment with different fabrics and significantly improves the fidelity of virtual garment.

