Semantic Object Accuracy for Generative Text-to-Image Synthesis

Hinz, Tobias; Heinrich, S.; Wermter, Stefan

doi:10.1109/tpami.2020.3021209

Cited by 93 publications

(87 citation statements)

References 36 publications

“…In the field of text-to-image synthesis, Hinz et al [61] introduce Semantic Object Accuracy (SOA) to evaluate images given an image caption. LayoutGAN is proposed by Li et al [62] for graphic design and scene generation, introducing wireframe rendering for image discrimination.…”

Section: B. GUI Generation (mentioning)

confidence: 99%

GUIGAN: Learning to Generate GUI Designs Using Generative Adversarial Networks

Zhao, Chen, Liu, et al. 2021

2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE)

Graphical User Interface (GUI) is ubiquitous in almost all modern desktop software, mobile applications, and online websites. A good GUI design is crucial to the success of the software in the market, but designing a good GUI which requires much innovation and creativity is difficult even to well-trained designers. Besides, the requirement of the rapid development of GUI design also aggravates designers' working load. So, the availability of various automated generated GUIs can help enhance the design personalization and specialization as they can cater to the taste of different designers. To assist designers, we develop a model GUIGAN to automatically generate GUI designs. Different from conventional image generation models based on image pixels, our GUIGAN is to reuse GUI components collected from existing mobile app GUIs for composing a new design that is similar to natural-language generation. Our GUIGAN is based on SeqGAN by modeling the GUI component style compatibility and GUI structure. The evaluation demonstrates that our model significantly outperforms the best of the baseline methods by 30.77% in Frechet Inception distance (FID) and 12.35% in 1-Nearest Neighbor Accuracy (1-NNA). Through a pilot user study, we provide initial evidence of the usefulness of our approach for generating acceptable brand new GUI designs.

“…In other words, if the model generates the same image, the FID will be higher (the lower the FID, the better), but IS can not penalize this case. [11,12,13] found that IS is not an appropriate metric to evaluate the text-to-image synthesis models since some models tend to generate the same image when the text contains the same word, which is not good generative models but IS could be high (the higher the IS, the better). Thus, we use FID to evaluate our models.…”

Section: Evaluation Details (mentioning)

confidence: 99%
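
The FID behaviour described in the statement above can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions: it fits per-dimension (diagonal-covariance) Gaussians to hypothetical two-dimensional feature vectors, whereas actual FID uses full covariance matrices of Inception features. The function name and the toy data are illustrative and do not come from any of the cited papers.

```python
from math import sqrt
from statistics import fmean, pvariance


def diagonal_frechet_distance(real, generated):
    """Squared Frechet distance between two Gaussians fitted per dimension.

    Assumes diagonal covariances, so the trace term of the full formula
    reduces to a sum of squared differences of standard deviations.
    """
    total = 0.0
    for d in range(len(real[0])):
        r = [row[d] for row in real]
        g = [row[d] for row in generated]
        mean_gap = fmean(r) - fmean(g)
        std_gap = sqrt(pvariance(r)) - sqrt(pvariance(g))
        total += mean_gap ** 2 + std_gap ** 2
    return total


# Hypothetical feature vectors (toy data, not Inception activations).
real = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
diverse = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]   # matches the real spread
collapsed = [[2.0, 3.0]] * 3                     # same sample repeated

print(diagonal_frechet_distance(real, diverse))
print(diagonal_frechet_distance(real, collapsed))
```

In this toy setting the collapsed set has the same mean as the real set but zero variance, so its distance is strictly larger than that of the diverse set. This is the penalty for repeated outputs that the statement attributes to FID.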

“…Most existing works [8,9,10,11,12] have achieved remarkable progress by proposing effective structures of GANs. StackGAN [8] uses the stacked structure of multiple GANs to decompose the hard problem of generating high-resolution images into tractable subproblems.…”

Section: Introduction (mentioning)

confidence: 99%

“…StackGAN [8] uses the stacked structure of multiple GANs to decompose the hard problem of generating high-resolution images into tractable subproblems. Subsequent studies [9,10,11,12] have refined the architecture based on StackGAN [8]. AttnGAN [9] adopts cross-modal attention mechanisms for fine-grained generation.…”

Section: Introduction (mentioning)

confidence: 99%

“…DM-GAN [10] leverages dynamic memory modules to supplement the generation procedure. Obj-GAN [11] and OP-GAN [12] use an additional input, pre-generated scene layout, to concentrate on creating objects. However, it is not easy to train multiple GANs at one time.…”

Section: Introduction (mentioning)

confidence: 99%

FA-GAN: Feature-Aware GAN for Text to Image Synthesis

Jeon, Kim 2021

2021 IEEE International Conference on Image Processing (ICIP)

Text-to-image synthesis aims to generate a photo-realistic image from a given natural language description. Previous works have made significant progress with Generative Adversarial Networks (GANs). Nonetheless, it is still hard to generate intact objects or clear textures (Fig 1). To address this issue, we propose Feature-Aware Generative Adversarial Network (FA-GAN) to synthesize a high-quality image by integrating two techniques: a self-supervised discriminator and a feature-aware loss. First, we design a self-supervised discriminator with an auxiliary decoder so that the discriminator can extract better representation. Secondly, we introduce a feature-aware loss to provide the generator more direct supervision by employing the feature representation from the self-supervised discriminator. Experiments on the MS-COCO dataset show that our proposed method significantly advances the state-of-the-art FID score from 28.92 to 24.58.
