2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
DOI: 10.1109/wacv56688.2023.00037
More Control for Free! Image Synthesis with Semantic Diffusion Guidance

Cited by 82 publications (21 citation statements)
References 35 publications
“…Recently powerful controlling mechanisms [46,21,17] emerged to guide the diffusion process for text-to-image generation. Particularly, ControlNet [46] enables to condition the generation process using edges, pose, semantic masks, image depths, etc.…”
Section: Conditional and Specialized Text-to-video
confidence: 99%
“…To further explore the extensibility of diffusion models, many works have been devoted to diffusion-based conditional generation, which can be broadly classified into two categories. The first one is the approach known as classifier-guidance (Liu et al, 2023), which utilizes a classifier to promote the sampling process of the pre-trained unconditional model. Despite the low cost, the generation effect is less competitive.…”
Section: Preliminaries and Background
confidence: 99%
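The classifier-guidance mechanism quoted above shifts the diffusion model's noise prediction using the gradient of a classifier's log-probability. A minimal NumPy sketch of that update rule is below; the function name, the guidance scale, and the toy tensors are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def guided_noise(eps, classifier_grad, sigma, scale):
    """Classifier-guided noise estimate: the unconditional prediction `eps`
    is shifted along the classifier gradient of log p(y | x), weighted by
    the current noise level `sigma` and a user-chosen guidance `scale`."""
    return eps - scale * sigma * classifier_grad

# Toy example with made-up values (no real model or classifier involved).
eps = np.ones(4)              # unconditional noise prediction from the diffusion model
grad = np.full(4, 0.5)        # hypothetical gradient of log p(y | x) w.r.t. x
eps_hat = guided_noise(eps, grad, sigma=1.0, scale=2.0)
print(eps_hat)                # guided prediction used in the sampling step
```

A larger `scale` pushes samples more strongly toward the class label at the cost of diversity, which matches the quoted observation that pure classifier guidance is cheap but can be less competitive in sample quality.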
“…Both "Paint By Word" [5] and ManiGAN [28] are restricted to specific image domains and are not applicable to open natural images. SDG [31] and DiffusionCLIP [25] are proposed to utilize a diffusion model in order to perform global text-guided image manipulations. GLIDE [36] and DALL•E 2 [39] focus on text-driven open domain image synthesis, as well as local image editing.…”
Section: Text-guided Image Manipulation
confidence: 99%