2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (2018)
DOI: 10.1109/cvpr.2018.00649

Photographic Text-to-Image Synthesis with a Hierarchically-Nested Adversarial Network

Abstract: This paper presents a novel method to deal with the challenging task of generating photographic images conditioned on semantic image descriptions. Our method introduces accompanying hierarchical-nested adversarial objectives inside the network hierarchies, which regularize mid-level representations and assist generator training to capture the complex image statistics. We present an extensile single-stream generator architecture to better adapt the jointed discriminators and push generated images up to high res…

Cited by 273 publications (251 citation statements)
References 48 publications (86 reference statements)
“…For fair comparison, the multi-scale resolution settings are the same as used with the experiments on the CelebA dataset. In particular, we use 32 × 32 and 64 × 64 resolution scales for StackGAN [15] training and 16 × 16, 32 × 32 and 64 × 64 multiple resolution scales for HDGAN [17] and StackGAN++ [16] as well as our proposed method. The disCVAE method produces reconstructions which are blurry.…”
Section: B. LFW Dataset Results (mentioning)
confidence: 99%
“…Previous conditional GAN-based approaches such as GAN-INT-CLS [14] and StackGAN [15] also produce poor quality results due to the model collapse during training. Recent StackGAN++ and HDGAN works generate plausible facial images (HDGAN is better at the color diversity). The previous work Attribute2Sketch2Face, which is a combination of CVAE and GAN, is also able to generate facial images with corresponding attributes.…”
[Table spilled into the excerpt: FID score by method. HDGAN [17]: 114.912; StackGAN++ [16]: 35.988; Single-scale (proposed method): 37.381; Proposed method: 30.566]
Section: B. LFW Dataset Results (mentioning)
confidence: 99%