2021
DOI: 10.48550/arxiv.2107.12579
|View full text |Cite
Preprint
|
Sign up to set email alerts
|

Remember What You have drawn: Semantic Image Manipulation with Memory

Abstract: Image manipulation with natural language, which aims to manipulate images with the guidance of language descriptions, has been a challenging problem in the fields of computer vision and natural language processing (NLP). Currently, a number of efforts have been made for this task, but their performances are still distant away from generating realistic and text-conformed manipulated images. Therefore, in this paper, we propose a memory-based Image Manipulation Network (MIM-Net), where a set of memories learned … Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
3

Citation Types

0
3
0

Year Published

2021
2021
2021
2021

Publication Types

Select...
3

Relationship

2
1

Authors

Journals

citations
Cited by 3 publications
(3 citation statements)
references
References 21 publications
0
3
0
Order By: Relevance
“…Semantic segmentation [29,35,15,14] is a task of classifying each pixel in an image into a specified category and has been applied in various fields [30,24]. State-of-the-art segmentation methods are usually based on the Fully Convolutional Network (FCN) [18], which uses a classification network as the backbone and replaces fully connected layers with convolutional layers to predict the dense segmentation map.…”
Section: Semantic Segmentationmentioning
confidence: 99%
“…Semantic segmentation [29,35,15,14] is a task of classifying each pixel in an image into a specified category and has been applied in various fields [30,24]. State-of-the-art segmentation methods are usually based on the Fully Convolutional Network (FCN) [18], which uses a classification network as the backbone and replaces fully connected layers with convolutional layers to predict the dense segmentation map.…”
Section: Semantic Segmentationmentioning
confidence: 99%
“…Currently, state-of-the-art methods handle image semantic segmentation as a dense prediction task and adopt fully convolutional networks to make predictions [26], [27]. To make pixel-level dense predictions, encoder-decoder structures [28], [29], [30], [31], [17], [32], [33], [34], [35], [36], [37], [38], [39] are widely used to reconstruct high-resolution prediction maps. Typically an encoder gradually downsamples the feature maps, aiming to acquire large field-of-view and capture the semantic object information.…”
Section: A Semantic Segmentationmentioning
confidence: 99%
“…Semantic segmentation [58][59][60][61][62] is a task of classifying each pixel in an image into a specified category and has been applied in various fields [63][64][65]. State-of-theart segmentation methods are usually based on the Fully Convolutional Network 2.3.…”
Section: Image Segmentation With Limited Supervision 231 Semantic Seg...mentioning
confidence: 99%