Resolution-robust Large Mask Inpainting with Fourier Convolutions

Suvorov, Roman; Logacheva, Elizaveta; Mashikhin, Anton; Remizova, Anastasia; Ashukha, Arsenii; Silvestrov, Aleksei; Kong, Naejin; Goka, Harshith; Park, Kiwoong; Lempitsky, Victor

doi:10.48550/arxiv.2109.07161

Cited by 18 publications (51 citation statements)

References 50 publications (110 reference statements)


“…As image inpainting requires a high-level semantic context, and to explicitly include it in the generation pipeline, there exist hand-crafted architectural designs such as Dilated Convolutions [13,38] to increase the receptive field, Partial Convolutions [16] and Gated Convolutions [41] to guide the convolution kernel according to the inpainted mask, Contextual Attention [39] to leverage on global information, Edges maps [7,22,36,37] or Semantic Segmentation maps [11,25] to further guide the generation, and Fourier Convolutions [32] to include both global and local information efficiently. Although recent works produce photo-realistic results, GANs are well known for textural synthesis, so these methods shine on background completion or removing objects, which require repetitive structural synthesis, and struggle with semantic synthesis (See Figure 5).…”

Section: Related Work (mentioning)

confidence: 99%

“…In our resampling approach, we use this DDPM property to harmonize the input of the model. Consequently, we diffuse the output x_{t−1} back to x_t by sampling from (1) as … [figure column labels: ICT [35], Deep Fill v2 [40], LaMa [33], RePaint (ours)] Since this operation can only harmonize one step, it might not be able to incorporate the semantic information over the entire denoising process. To overcome this problem, we denote the time horizon of this operation as jump length, which is j = 1 for the previous case.…”

Section: Resampling (mentioning)

confidence: 99%
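The "diffuse the output x_{t−1} back to x_t" step quoted above can be illustrated with a minimal sketch. It assumes that equation (1) of the citing paper is the standard DDPM forward transition, x_t ~ N(sqrt(1 − β_t) · x_{t−1}, β_t · I); the equation itself is not reproduced in the quote, so this is an assumption. The function name `diffuse_one_step` and the value of `beta_t` are hypothetical and chosen only for illustration.

```python
import numpy as np

def diffuse_one_step(x_prev, beta_t, rng):
    """Sample x_t from N(sqrt(1 - beta_t) * x_prev, beta_t * I).

    Assumes the standard DDPM forward transition; the quoted
    statement does not show its equation (1).
    """
    noise = rng.standard_normal(x_prev.shape)
    return np.sqrt(1.0 - beta_t) * x_prev + np.sqrt(beta_t) * noise

# Hypothetical usage: re-noise a 4x4 "image" by one step.
rng = np.random.default_rng(0)
x_prev = np.ones((4, 4))
x_t = diffuse_one_step(x_prev, beta_t=0.02, rng=rng)
```

With a jump length of j = 1, as in the quoted text, a step like this would be applied once before denoising again; larger jump lengths would repeat it j times.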

“…The autoregressive methods are DSI [27] and ICT [35], and the GAN methods are DeepFillv2 [40], AOT [44], and LaMa [33]. We use their publicly available pretrained models.…”

Section: Comparison With State-of-the-art (mentioning)

confidence: 99%

“…To validate our method on the standard image inpainting scenario, we use the LaMa [33] settings for Wide and Narrow masks. RePaint outperforms all other methods with a significance margin of 95% in both CelebA-HQ and ImageNet, for both Wide and Narrow settings.…”

Section: Wide and Narrow Masks (mentioning)

confidence: 99%

“…Inpainting approaches thus require strong generative capabilities. To this end, current approaches [17,32,41,44] rely on GANs [6] or Autoregressive Modeling [27,34,42]. Moreover, inpainting methods need to handle various forms of masks such as thin or thick brushes, squares, or even extreme masks where the vast majority of the image is missing.…”

Section: Introduction (mentioning)

confidence: 99%


RePaint: Inpainting using Denoising Diffusion Probabilistic Models

Lugmayr, Danelljan, Romero et al. 2022

Preprint


Figure 1. We use Denoising Diffusion Probabilistic Models (DDPM) for inpainting. The process is conditioned on the masked input (left). It starts from a random Gaussian noise sample that is iteratively denoised until it produces a high-quality output. Since this process is stochastic, we can sample multiple diverse outputs. The DDPM prior forces a harmonized image, is able to reproduce texture from other regions, and inpaint semantically meaningful content.

