With the introduction of mapping networks in StyleGAN [14, 15, 16], the input random noise can first be mapped to an intermediate latent space with disentangled semantics, allowing the model to generate higher-quality images. Several works [1, 17, 33] have further shown that exploring the latent space of StyleGAN is useful for text-driven image synthesis and manipulation, where the pretrained vision-language model CLIP [37] is used to manipulate pretrained unconditional StyleGAN generators. To relieve the need for paired text data during the training phase, Lafite [55] proposes to adopt image CLIP embeddings as the conditioning input during training while substituting text CLIP embeddings at inference.
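Lafite's language-free training exploits the fact that CLIP embeds images and text into an approximately shared space: the generator is conditioned on image embeddings during training, and text embeddings can be swapped into the same conditioning slot at inference. The sketch below illustrates this idea only; it assumes the OpenAI `clip` package, and the generator `G` and the function names are hypothetical placeholders, not Lafite's actual implementation.

```python
import torch
import clip

# Load a pretrained CLIP model (ViT-B/32 is an assumption; the exact
# backbone used by Lafite may differ).
device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def train_condition(image):
    # Training phase: condition on the CLIP *image* embedding,
    # so no paired captions are required.
    with torch.no_grad():
        emb = model.encode_image(preprocess(image).unsqueeze(0).to(device))
    # CLIP embeddings are typically L2-normalized before use.
    return emb / emb.norm(dim=-1, keepdim=True)

def infer_condition(prompt):
    # Inference phase: swap in the CLIP *text* embedding for the same
    # conditioning slot, relying on the shared image-text embedding space.
    tokens = clip.tokenize([prompt]).to(device)
    with torch.no_grad():
        emb = model.encode_text(tokens)
    return emb / emb.norm(dim=-1, keepdim=True)

# Hypothetical usage with a conditional generator G(z, c):
# z = torch.randn(1, 512, device=device)
# fake   = G(z, train_condition(real_image))                 # training
# sample = G(z, infer_condition("a photo of a red bird"))    # inference
```

The design choice worth noting is that the training and inference paths produce conditioning vectors of the same dimensionality and normalization, which is what makes the train-on-images, infer-from-text substitution possible.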