2021
DOI: 10.48550/arxiv.2110.12427
Preprint

Image-Based CLIP-Guided Essence Transfer

Abstract: The conceptual blending of two signals is a semantic task that may underlie both creativity and intelligence. We propose to perform such blending in a way that incorporates two latent spaces: that of the generator network and that of the semantic network. For the first network, we employ the powerful StyleGAN generator, and for the second, the powerful image-language matching network of CLIP. The new method creates a blending operator that is optimized to be simultaneously additive in both latent spaces. Our …

Citations: cited by 5 publications (6 citation statements)
References: 17 publications (35 reference statements)

“…Hence, they fail to achieve the same quality. Chefer et al [2021] utilize CLIP to blend two facial images, demonstrating better preservation of the original identity while successfully transferring meaningful semantic features from the desired target images. find meaningful directions in CLIP-space in an unsupervised manner, map them to latent-space directions, and use CLIP to automatically generate natural language descriptions for these directions.…”
Section: Latent Space Editing (mentioning)
confidence: 99%
“…These studies have revealed that the latent space is disentangled in different degrees, and therefore is suitable in various tasks. Due to the disentangle property in W and S spaces, large number of works Chefer et al [2021], , Roich et al…”
Section: Feature Disentanglement in StyleGAN Latent Space (mentioning)
confidence: 99%
“…using text-driven latent manipulation , Gal et al [2021], . Furthermore, reference images/videos have also been considered Chefer et al [2021], , Lewis et al [2021] to pinpoint the generation process.…”
Section: Introduction (mentioning)
confidence: 99%
“…VQGAN-CLIP [8,9,45] leverage CLIP for text-guided image generation. Concurrent work uses CLIP to fine-tune a pre-trained StyleGAN [11], and for image stylization [6]. Another concurrent work uses the ShapeNet dataset [5] and CLIP to perform unconditional 3D voxel generation [48].…”
Section: Related Work (mentioning)
confidence: 99%