Temporally Consistent Semantic Video Editing

Xu, Yang; AlBahar, Badour; Huang, Jia-Bin

doi:10.1007/978-3-031-19784-0_21

Cited by 14 publications

(10 citation statements)

References 47 publications


“…Other methods combine the optimization and encoder approaches and propose hybrid strategies by using the encoder for initialization and refining the latent code by optimization [35,95]. Recent 2D GAN inversion methods achieve faithful reconstruction with high editing capabilities and have been extended for video editing [2,75,88]. However, editing 3D-related attributes such as camera parameters and head pose remains inconsistent and prone to severe flickering as the pre-trained generator is unaware of the 3D structure.…”

Section: GAN Inversion (mentioning)

confidence: 99%

DisCO: Portrait Distortion Correction with Perspective-Aware 3D GANs

Wang, Liu, Huang et al. 2023

Preprint


“…Generative models have demonstrated remarkable ability in synthesizing photorealistic images, including human faces [27]. Recent work has extended these models to add intuitive semantic editing, such as synthesis of glasses on faces [20,30,35,70,73]. Fader Networks [30] disentangle the salient image information, and then generate different images by varying attribute values, including glasses on faces.…”

Section: Related Work (mentioning)

confidence: 99%

“…Subsequent work has proposed two decoders for modeling latent representations and facial attributes [20], selective transfer units [35], and geometry-aware flow [76] to further improve editing fidelity. Yao et al. [73] extend facial attribute editing to video sequences via latent transformation and an identity preservation loss, which is further improved by Xu et al. [70], incorporating flow-based consistency. More recent works propose 3D-aware generative models to achieve view-consistent synthesis [8,10,49,55,63,67,71].…”

Section: Related Work (mentioning)

confidence: 99%

“…VideoEditGAN [70] is a SOTA image-based editing method that allows us to insert glasses on face images. As shown in Fig.…”

Section: [title garbled in source] (mentioning)

confidence: 99%

“…Another group of approaches aims to synthesize the composition of glasses in the image domain [30,70,73] by leveraging powerful 2D generative models [27]. While these approaches can produce photorealistic images, animation results typically suffer from view and temporal inconsistencies due to the lack of 3D information.…”

Section: Introduction (mentioning)

confidence: 99%
