2022
DOI: 10.1109/tpami.2021.3078270
Liquid Warping GAN With Attention: A Unified Framework for Human Image Synthesis

Cited by 24 publications (12 citation statements)
References: 59 publications
“…The main limitation of these methods is that they lack multi-view and temporal consistency under significant viewpoint and pose changes, since these 2D-based methods generally do not learn any notion of 3D space. Instead of fully relying on the image-space translation network, Liquid Warping GAN [15,16] uses UV correspondences between the source and target meshes to explicitly warp the source image to the target pose and then uses an additional network to refine the warped image. Given a target pose in the form of 2D keypoints, Huang et al. [8] predict the UV coordinates of a dense mesh in image space and then generate the target image by sampling from a learned UV texture map.…”
Section: 3D Neural Rendering Methods
confidence: 99%
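The excerpt above describes the core of the warp-then-refine pipeline: UV correspondences between the source and target meshes define, for every target pixel, where to read from the source image, and a second network cleans up the warped result. The following is a minimal sketch of that idea in PyTorch; the function and class names (`warp_to_target`, `RefineNet`) and the tiny refinement architecture are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch: warp a source image to the target pose with a dense
# correspondence (flow) field, then refine it with a small network.
# Tensor shapes and names are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

def warp_to_target(src_img, flow):
    """src_img: (B, 3, H, W); flow: (B, H, W, 2) sampling grid in [-1, 1]
    giving, for every target pixel, where to read from the source image."""
    return F.grid_sample(src_img, flow, mode='bilinear',
                         padding_mode='border', align_corners=True)

class RefineNet(nn.Module):
    """Tiny refinement network: predicts a residual on top of the warped image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, warped):
        return torch.tanh(warped + self.net(warped))

if __name__ == "__main__":
    B, H, W = 1, 256, 256
    src = torch.rand(B, 3, H, W)
    # Identity grid stands in for the flow computed from mesh correspondences.
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, H),
                            torch.linspace(-1, 1, W), indexing='ij')
    flow = torch.stack([xs, ys], dim=-1).unsqueeze(0)
    out = RefineNet()(warp_to_target(src, flow))
    print(out.shape)  # torch.Size([1, 3, 256, 256])
```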
“…Existing 3D human generation approaches from a single image lack photorealism. Existing methods such as PIFu suffer from blurriness; Impersonator++ [Liu et al 2021b] tends to duplicate content from the front view, suffering from projection artifacts; TEXTure [Richardson et al 2023] fails to preserve the appearance of the input view and results in saturated colors; Magic123 [Qian et al 2023] fails to synthesize realistic shape and appearance. Images from Adobe Stock.…”
Section: PIFu
confidence: 99%
“…While these models are unconditional, several works extend them to conditional generative models such that we can control poses while retaining the identity of an input subject. By incorporating additional conditions these works can achieve human reposing [AlBahar and Huang 2019; AlBahar et al 2021; Liu et al 2021b; Ma et al 2017, 2018; Men et al 2020; Siarohin et al 2018; Zhu et al 2019]. [… 2018] to warp input images to the target view as an initialization of the synthesis. Impersonator++ [Liu et al 2021b] further improves the robustness to a large pose change by leveraging a parametric human body model [Loper et al 2015] and warping blocks to better preserve the information from the input.…”
Section: PIFu
confidence: 99%
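The "warping blocks" mentioned above apply the mesh-derived flow not only to pixels but also to intermediate generator features, so source appearance survives large pose changes. Below is a minimal sketch of such a feature-level warping block, assuming PyTorch; the class name and the 1x1-conv fusion are assumptions for illustration, not the published architecture.

```python
# Minimal sketch of a feature-level warping block: the same flow that maps the
# source mesh to the target mesh is used to warp intermediate generator
# features, which are then fused with the target-branch features.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureWarpBlock(nn.Module):
    """Warps source-branch features into the target pose and fuses them
    with the target-branch features at the same resolution."""
    def __init__(self, channels):
        super().__init__()
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, src_feat, tgt_feat, flow):
        # flow: (B, H, W, 2) sampling grid in [-1, 1]; resize it to the
        # feature resolution before sampling.
        grid = F.interpolate(flow.permute(0, 3, 1, 2),
                             size=src_feat.shape[-2:], mode='bilinear',
                             align_corners=True).permute(0, 2, 3, 1)
        warped = F.grid_sample(src_feat, grid, align_corners=True)
        return self.fuse(torch.cat([warped, tgt_feat], dim=1))
```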
“…Early methods focused on retrieval-based pipelines to retarget the subject in a video to the desired motion using optical flow [11] or a 3D human skeleton [41]. Recent methods usually adopt deep neural networks, especially GANs [13], and exploit 2D human pose [2,5,30,35,36], human parsing [8,21,26,39], 3D human pose [37], and other 3D information [20,22,31,44] for human motion transfer [23,24].…”
Section: Related Work 2.1 Human Motion Transfer
confidence: 99%
“…Specifically, given an image of a source person and a target video with the desired motion, a human motion transfer method can generate a video of the source person performing the target motion. It is an emerging topic due to potential applications such as movie editing, virtual try-on, and online education [24]. Benefiting from Generative Adversarial Networks (GANs) [13,36] with pixel-domain supervision, the frame-level visual quality of synthetic videos has improved significantly.…”
Section: Introduction
confidence: 99%
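The "pixel-domain supervision" referred to in the last excerpt typically means training the generator with an adversarial loss plus a per-pixel reconstruction loss against the ground-truth frame. A minimal sketch of such a combined objective follows, assuming PyTorch; the weighting `lambda_pix` and the function names are illustrative assumptions, not the losses used by any specific cited method.

```python
# Minimal sketch of GAN training with pixel-domain supervision:
# adversarial loss plus an L1 reconstruction loss on the synthesized frame.
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()
l1 = nn.L1Loss()

def generator_loss(disc_logits_fake, fake_frame, real_frame, lambda_pix=10.0):
    # Fool the discriminator while staying close to the ground-truth frame.
    adv = bce(disc_logits_fake, torch.ones_like(disc_logits_fake))
    pix = l1(fake_frame, real_frame)
    return adv + lambda_pix * pix

def discriminator_loss(disc_logits_real, disc_logits_fake):
    # Standard real-vs-fake classification objective.
    real = bce(disc_logits_real, torch.ones_like(disc_logits_real))
    fake = bce(disc_logits_fake, torch.zeros_like(disc_logits_fake))
    return 0.5 * (real + fake)
```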