2021
DOI: 10.48550/arxiv.2112.13592
Preprint

Multimodal Image Synthesis and Editing: A Survey

Abstract: As information exists in various modalities in the real world, effective interaction and fusion among multimodal information play a key role in the creation and perception of multimodal data in computer vision and deep learning research. With superb power in modelling the interaction among multimodal information, multimodal image synthesis and editing have become a hot research topic in recent years. Different from traditional visual guidance, which provides explicit clues, multimodal guidance offers intuitive an…

Citations: cited by 13 publications (15 citation statements)
References: 155 publications (288 reference statements)
“…Meanwhile, the advances of deep learning provide powerful tools such as CNN to process image data. The image editing task aims to generate a new image from a source image by editing the contents of the source image under certain guidance while keeping other properties unchanged [265]. The input and output of the model for image editing are the images represented by the pixel matrix with multiple color channels.…”
Section: Image Editing
Citation type: mentioning
confidence: 99%
“…Image Generation Loss. Image generation tasks entail various losses to achieve dedicated purposes in image synthesis [23,24,32,39,40,43,44,47-49]. For instance, unpaired image translation is usually associated with certain losses to encourage correlation between the input and output images.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Due to the superior generation capability, GAN-based image-to-image translation [25, 34-38, 44, 52] has been extensively investigated and achieved remarkable progress on translating different conditions such as semantic segmentation [10,25,32,41,43], key points [20,22,40,42] and edge maps [15,39,53].…”
Section: Image-to-image Translation
Citation type: mentioning
confidence: 99%