2022
DOI: 10.1007/978-3-031-19778-9_40

Image-Based CLIP-Guided Essence Transfer

Cited by 25 publications (9 citation statements) | References 37 publications

“…Leveraging these powerful generative models, many have attempted to utilize such models for downstream editing tasks [9,18,21,25,29,47]. Most text-guided generation techniques condition the diffusion model directly on embeddings extracted from a pretrained text encoder [3,5,6,18,31]. In this work, we utilize a Latent Diffusion Model [35] paired with a Diffusion Prior model [33,39] and show its benefits in the context of creative generation.…”
Section: Related Work
confidence: 99%
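The statement above describes the common recipe of conditioning a diffusion denoiser directly on a pretrained text encoder's embedding, optionally routed through a diffusion prior that maps the text embedding into an image-like embedding before conditioning. The sketch below is a minimal illustration of that loop under assumed interfaces; `denoiser`, `scheduler`, `text_encoder`, and `prior` are hypothetical caller-supplied wrappers, not a specific library's API.

```python
# Minimal sketch (hypothetical interfaces): a text-conditioned reverse-diffusion
# loop, optionally passing the text embedding through a diffusion prior first.
import torch

def text_guided_sample(denoiser, scheduler, text_encoder, prompt,
                       prior=None, steps=50, latent_shape=(1, 4, 64, 64)):
    """Toy sampling loop conditioned on a frozen text encoder's embedding."""
    cond = text_encoder(prompt)           # text embedding from a pretrained encoder
    if prior is not None:
        cond = prior(cond)                # diffusion prior: text -> image embedding
    x = torch.randn(latent_shape)         # start from Gaussian noise (latent space in an LDM)
    for t in scheduler.timesteps(steps):  # iterate from high to low noise levels
        eps = denoiser(x, t, cond)        # predict noise, conditioned on `cond`
        x = scheduler.step(eps, t, x)     # one denoising update
    return x                              # an LDM would decode this with its VAE
```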
“…To provide users with more control over the synthesis process, several works employ a segmentation map or spatial conditioning [4,17,54]. In the context of image editing, while most methods are generally limited to global edits [9,14,19,26], several works introduce a user-provided mask to specify the region that should be altered [3,7,13,34].…”
Section: Related Work
confidence: 99%
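The mask-based editing mentioned above amounts to restricting an otherwise global edit to a user-specified region. A minimal sketch of that blending step, assuming plain image tensors rather than any particular method:

```python
# Minimal sketch (assumed tensors, not a specific method): restrict an edit to a
# user-provided region by blending the edited image with the original.
import torch

def apply_masked_edit(original, edited, mask):
    """Keep `edited` pixels inside the mask and `original` pixels outside it.

    original, edited: (C, H, W) tensors in [0, 1]
    mask:             (1, H, W) tensor in [0, 1], where 1 marks the region to alter
    """
    return mask * edited + (1.0 - mask) * original
```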
“…Several previous works have attempted to combine GAN and CLIP to achieve text-to-image generation [4,46,65]. Specifically, StyleGAN [22,23,21,51] focuses on the latent space to enable better control over generated images.…”
Section: Text-to-image Manipulation/Generation
confidence: 99%
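The GAN-plus-CLIP combination referenced in this statement typically means optimizing a StyleGAN-style latent code under a CLIP similarity loss so that the generated image moves toward a target embedding. The sketch below illustrates that generic recipe, not the cited paper's exact method; `generator` and `clip_model` are assumed wrappers, and the regularization weight is arbitrary.

```python
# Minimal sketch (hypothetical `generator` and `clip_model` wrappers): optimize a
# latent offset so the generated image's CLIP embedding matches a target embedding.
import torch
import torch.nn.functional as F

def clip_guided_edit(generator, clip_model, w_init, target_embed,
                     steps=200, lr=0.05):
    """Optimize an offset in the generator's latent space under a CLIP loss."""
    delta = torch.zeros_like(w_init, requires_grad=True)   # latent edit direction
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        image = generator(w_init + delta)                  # synthesize from the edited latent
        img_embed = clip_model.encode_image(image)         # CLIP image embedding
        loss = 1.0 - F.cosine_similarity(img_embed, target_embed, dim=-1).mean()
        loss = loss + 0.01 * delta.pow(2).mean()           # keep the edit small
        opt.zero_grad(); loss.backward(); opt.step()
    return w_init + delta
```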