Proceedings of the 29th ACM International Conference on Multimedia 2021
DOI: 10.1145/3474085.3475436

Diverse Image Inpainting with Bidirectional and Autoregressive Transformers

citations
Cited by 88 publications
(61 citation statements)
references
References 30 publications
“…Image Generation Loss Image generation tasks entail various losses to achieve dedicated purposes in image synthesis [23,24,32,39,40,43,44,[47][48][49]. For instance, unpaired image translation is usually associated with certain losses to encourage correlation between the input and output images.…”
Section: Related Work (mentioning)
confidence: 99%
“…For example, Wan et al [33] propose the first transformer based image inpainting method to get the image prior and send the image prior to a CNN. To incorporate the image prior, the approach of [37] designs a bidirectional and autoregressive transformer. More recently, …”
Section: Related Work, 2.1 Image Inpainting (mentioning)
confidence: 99%
“…Recent years, most state-of-the-art approaches are mainly based on convolutional neural networks or transformer. In the approaches of [22,35,38,40], they apply the convolutional neural networks for image inpainting, while other line of research [33,37] leverages the transformer in image inpainting at the low-resolution image space, and then introduces the GAN based networks for high quality image generation. Suvorov et al [31] utilize the Fast Fourier Convolution (FFC) instead of regular convolution to obtain features of global receptive fields in frequency domain.…”
Section: Introduction (mentioning)
confidence: 99%
“…Nevertheless, the above approaches expose a common drawback in recovering the image global structure. Therefore, many studies improve the network to better recover the global structure by introducing relevant structural priors [4,23,25,31,43]. However, these low-level structural priors are difficult to obtain, under the large corrupted regions.…”
Section: Introduction (mentioning)
confidence: 99%
“…• We propose InCo 2 Loss, a pair of similarity based losses to further improve the inter-coordination between the corrupted and non-corrupted regions and the intra-coordination in corrupted regions. [43] employ autoregressive transformers to inpaint diverse faces. However, these methods generally ignore the modeling of the facial internal correlations, and limit the refinement of the specific facial semantic regions.…”
Section: Introduction (mentioning)
confidence: 99%