2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr42600.2020.00556
CookGAN: Causality Based Text-to-Image Synthesis

Cited by 59 publications (26 citation statements)
References 16 publications
“…Our proposed unified framework is implemented based on StyleGAN2-Ada [12]. As is shown in Table 1, we find our proposed CI-GAN 4.54 ± 0.07 -StackGAN++ [37] 5.03 ± 0.09 -CookGAN [38] 5.41 ± 0.11 -CI-GAN (Ours)…”
Section: Implementation Details
confidence: 95%
“…This domain gap can be bridged by additionally learning or fine-tuning the text encoder (or parts of it) during the generative model training; however, due to the complexity of learning an effective decoder, this might result in sub-optimal cross-modal textual representations. Recently, cross-domain retrieval and synthesis frameworks have attempted to alleviate this, particularly for complex cooking recipe descriptions [10,37,45,46]. These last methods can be split into joint [37,46] and separate [10,45] embedding and synthesis.…”
Section: Cross-modal Synthesis
confidence: 99%
“…Recently, cross-domain retrieval and synthesis frameworks have attempted to alleviate this, particularly for complex cooking recipe descriptions [10,37,45,46]. These last methods can be split into joint [37,46] and separate [10,45] embedding and synthesis. The method proposed here is closely related to these methods; however, it significantly differs in how the conditional information is generated, as stated above.…”
Section: Cross-modal Synthesis
confidence: 99%