2023
DOI: 10.48550/arxiv.2302.11797
Preprint

Region-Aware Diffusion for Zero-shot Text-driven Image Editing

Fig. 1: Results of the proposed region-aware diffusion model (RDM). The texts adhere to the phrase rule "A → B", indicating that RDM transforms entity A into entity B.

Abstract: Image manipulation under the guidance of textual descriptions has recently received a broad range of attention. In this study, we focus on the regional editing of images under the guidance of given text prompts. Unlike current mask-based image editing methods, we propose a novel region-aware diffusion model (RDM) for entity-level image…

Cited by 2 publications (5 citation statements)
References 42 publications
“…Recent entity-level editing methods (Huang et al 2023;Hertz et al 2022) have been inspired by exerting control over the latent space or attention maps (Chen, Laina, and Vedaldi 2023). Their constraints on the initial image layout hinder the ability to make substantial structural modifications, not to mention the process of object addition or removal.…”
Section: Introduction (mentioning)
confidence: 99%
“…1. To mitigate the interference issue, we employ a pre-trained CLIP segmentation model from RDM (Huang et al 2023), denoted as guidance model Φ, to impose spatial-aware guidance.…”
Section: Multi-Region-Guided Diffusion (mentioning)
confidence: 99%
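The spatial-aware guidance described in the statement above — using a pre-trained CLIP segmentation model to confine edits to the text-relevant region — can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a per-pixel text-image similarity map (as a CLIP-based segmentation model such as CLIPSeg would produce) and shows only how thresholding it yields a soft mask; `spatial_guidance_mask` and its parameters are hypothetical names.

```python
import numpy as np

def spatial_guidance_mask(similarity_map, threshold=0.5):
    """Turn a per-pixel text-image similarity map (e.g. from a
    CLIP-based segmentation model) into a soft editing mask.

    Pixels at or above `threshold` are fully editable (1.0); the
    rest fade out linearly so the edit blends into its surroundings.
    """
    sim = np.clip(similarity_map, 0.0, 1.0)
    return np.where(sim >= threshold, 1.0, sim / threshold)

# Toy 2x2 similarity map: the left column scores low except the
# top-left pixel, which clearly matches the text prompt.
sim = np.array([[0.9, 0.2],
                [0.1, 0.6]])
mask = spatial_guidance_mask(sim, threshold=0.5)
```

A soft (rather than binary) mask is one common way to avoid visible seams at the region boundary; the exact masking scheme in the cited work may differ.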
“…FISEdit [221], Blended Latent Diffusion [222], PFB-Diff [223], DiffEdit [224], RDM [225], MFL [226], Differential Diffusion [227], Watch Your Steps [228], Blended Diffusion [229], ZONE [230], Inpaint Anything [231] Multi-Noise Redirection The Stable Artist [232], SEGA [233], LEDITS [234], OIR-Diffusion [235] Fig. 7: Taxonomy of training and finetuning free approaches for image editing.…”
Section: Training and Finetuning Free Approaches (mentioning)
confidence: 99%
“…This selective editing approach safeguards unedited regions and preserves their semantic integrity. RDM [225] introduces a region-aware diffusion model that seamlessly integrates masks to automatically pinpoint and edit regions of interest based on text-driven guidance. MFL [226] proposes a two-stage mask-free training paradigm tailored for textguided image editing.…”
Section: Mask Guidance (mentioning)
confidence: 99%
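The "selective editing" that the statement above attributes to mask-guided methods such as RDM and Blended Diffusion typically reduces to a per-step blending rule: inside the mask the prompt-driven denoised latent is kept, while outside it the latent is reset to the original image noised to the same timestep, which is what safeguards unedited regions. A minimal numpy sketch under that assumption (names hypothetical, not the papers' code):

```python
import numpy as np

def blend_step(x_edited, x_orig_noised, mask):
    """One mask-guided blending step for region-aware editing.

    x_edited      : current prompt-driven denoised latent
    x_orig_noised : original image latent, noised to the same timestep
    mask          : 1.0 inside the edit region, 0.0 outside

    Inside the mask the edit proceeds; outside it, the latent is
    reset to the original, preserving unedited content.
    """
    return mask * x_edited + (1.0 - mask) * x_orig_noised

x_edit = np.full((2, 2), 5.0)   # toy prompt-driven latent
x_orig = np.full((2, 2), 1.0)   # toy noised original latent
mask = np.array([[1.0, 0.0],
                [0.0, 0.0]])    # edit only the top-left pixel
x_t = blend_step(x_edit, x_orig, mask)
```

Applying this blend at every denoising step keeps the region boundary consistent with the evolving noise level, which is why the unedited background retains its semantic integrity.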