2021
DOI: 10.48550/arxiv.2112.08493
Preprint

StyleMC: Multi-Channel Based Fast Text-Guided Image Generation and Manipulation

Cited by 2 publications (2 citation statements)
References 0 publications
“…Aiming for text-guided image inpainting, Bau et al. [130] define a CLIP-based semantic consistency loss that optimizes the latent codes inside the inpainting region so that the result is semantically consistent with the given text. StyleCLIP [29] and StyleMC [131] use a pre-trained CLIP model as loss supervision to match the manipulated results to the text condition, as illustrated in Fig. 8.…”
Section: GAN Inversion
confidence: 99%
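
The shared recipe in these works can be made concrete: freeze a pre-trained generator, keep CLIP fixed, and optimize only a latent code so that the generated image's CLIP embedding matches the text prompt's. Below is a minimal sketch under those assumptions, using PyTorch and OpenAI's clip package; the ToyGenerator, the prompt, and all hyperparameters are illustrative stand-ins (in StyleCLIP/StyleMC the generator would be a frozen, pre-trained StyleGAN2 synthesis network), not the cited papers' actual implementations.

import torch
import torch.nn.functional as F
import clip  # OpenAI's CLIP: pip install git+https://github.com/openai/CLIP.git

device = "cpu"  # kept on CPU so the CLIP weights stay in float32
model, _ = clip.load("ViT-B/32", device=device)

# Illustrative stand-in for a frozen, pre-trained generator.
class ToyGenerator(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(512, 3 * 64 * 64)
    def forward(self, w):
        return torch.tanh(self.fc(w)).view(-1, 3, 64, 64)  # image in [-1, 1]

G = ToyGenerator().to(device)
for p in G.parameters():  # the generator is frozen; only the latent moves
    p.requires_grad_(False)

w = torch.randn(1, 512, device=device, requires_grad=True)  # latent to optimize
text_tokens = clip.tokenize(["a smiling face"]).to(device)  # illustrative prompt
optimizer = torch.optim.Adam([w], lr=0.01)

# Normalization statistics expected by CLIP's visual encoder
mean = torch.tensor([0.48145466, 0.4578275, 0.40821073]).view(1, 3, 1, 1)
std = torch.tensor([0.26862954, 0.26130258, 0.27577711]).view(1, 3, 1, 1)

for step in range(100):
    img = G(w)
    img = F.interpolate(img, size=224, mode="bilinear")  # CLIP input resolution
    img = ((img + 1) / 2 - mean) / std                   # map to CLIP's range
    img_feat = model.encode_image(img)
    txt_feat = model.encode_text(text_tokens)
    # CLIP loss supervision: 1 - cosine similarity of the two embeddings
    loss = 1 - F.cosine_similarity(img_feat, txt_feat).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

Because only w receives gradients, the generator's learned image prior constrains the edit; the text prompt steers the result through CLIP alone, with no task-specific training.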
“…Similarly, StyleCLIP [6] (illustrated in the third row of Fig. 2.2) and StyleMC [135] use the cosine similarity between CLIP representations of texts and images to supervise text-guided manipulation. A known issue with the standard CLIP loss is the adversarial solution [136], in which the model fools the CLIP classifier by adding meaningless pixel-level perturbations to the image.…”
Section: 3D-aware Generative Models
confidence: 99%
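
To make that failure mode concrete, here is a hedged sketch of the standard (global) CLIP loss the statement refers to, plus one common mitigation: penalizing the latent offset so the optimizer cannot drift far from the original code just to game CLIP similarity. The function names and the l2_lambda value are illustrative assumptions (StyleCLIP's latent-optimization setup uses a comparable L2 penalty on the latent offset), not the exact formulation of any cited paper.

import torch
import torch.nn.functional as F

def global_clip_loss(img_feat: torch.Tensor, txt_feat: torch.Tensor) -> torch.Tensor:
    # Standard CLIP loss: 1 - cos(E_I(image), E_T(text)),
    # minimized when the image and text embeddings align.
    return 1 - F.cosine_similarity(img_feat, txt_feat).mean()

def regularized_clip_loss(img_feat, txt_feat, delta_w, l2_lambda=0.008):
    # Penalizing ||delta_w||^2 discourages the adversarial solution:
    # an edit that stays close to the original latent cannot afford the
    # meaningless pixel-level perturbations that would otherwise fool CLIP.
    return global_clip_loss(img_feat, txt_feat) + l2_lambda * delta_w.pow(2).sum()

In practice such regularizers (latent L2 penalties, identity-preservation terms) are what keep CLIP-supervised edits semantic rather than adversarial.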