2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2019.01174

R²GAN: Cross-Modal Recipe Retrieval With Generative Adversarial Network

Cited by 114 publications (115 citation statements)
References 27 publications
“…Finally, as our model is specifically designed for food recognition, a rich set of domain discriminators and regularizers is considered as loss functions. These include multi-label classification of ingredients as a semantic regularizer as in [33], image and text domain discriminators as in [12], and reconstruction of images from recipes for shared representation learning [33,36].…”
Section: Related Work
confidence: 99%
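The loss composition described in the citation above can be sketched as three terms summed together. This is a minimal illustrative sketch, not the cited models' actual implementation; all function names, weights, and toy inputs are assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ingredient_regularizer(logits, labels, eps=1e-9):
    # Multi-label binary cross-entropy over ingredients: each output unit
    # predicts the presence (1) or absence (0) of one ingredient.
    p = sigmoid(logits)
    return -np.mean(labels * np.log(p + eps) + (1 - labels) * np.log(1 - p + eps))

def modality_discriminator_loss(scores, from_image, eps=1e-9):
    # A domain discriminator scores embeddings as image-like (1) or
    # text-like (0); training the encoders to fool it aligns the modalities.
    p = sigmoid(scores)
    t = 1.0 if from_image else 0.0
    return -np.mean(t * np.log(p + eps) + (1 - t) * np.log(1 - p + eps))

def reconstruction_loss(generated, target):
    # Mean squared error between an image generated from the recipe
    # embedding and the real food image (shared-representation learning).
    return np.mean((generated - target) ** 2)

# Toy example combining the three terms with illustrative weights.
rng = np.random.default_rng(0)
logits = rng.normal(size=8)
labels = (rng.random(8) > 0.5).astype(float)
scores = rng.normal(size=4)
img = rng.random((16, 16))

total = (ingredient_regularizer(logits, labels)
         + modality_discriminator_loss(scores, from_image=True)
         + 0.5 * reconstruction_loss(img, img))
```

In practice each term would be weighted and backpropagated jointly; the relative weights here are placeholders.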
“…Based upon these prior works [4,7,9,29,33,36], this paper extends cross-modal retrieval to cross-domain food retrieval. Leveraging image-recipe pairs in a source domain, we consider the problem of food transfer as recognizing food in a target domain with new food categories and attributes.…”
Section: Introduction
confidence: 99%
“…[Micael et al 2018] extended [Salvador et al 2017] with a double-triplet strategy that jointly expresses both the retrieval loss and the classification loss for cross-modal retrieval. [Wang et al 2019; Zhu et al 2019] further introduced adversarial networks to impose modality alignment for cross-modal retrieval. [Salvador et al 2019] proposed a new architecture for ingredient prediction that exploits co-dependencies among ingredients without imposing order, generating cooking instructions from an image and its ingredients.…”
Section: Reference
confidence: 99%
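The double-triplet strategy mentioned above can be illustrated with a minimal triplet hinge loss. The function, margin, and embeddings below are illustrative assumptions, not the cited papers' actual settings:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.3):
    # Hinge on squared Euclidean distances: pull the matching pair
    # (anchor, positive) closer than the mismatched pair by at least `margin`.
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

# A "double-triplet" objective would sum one retrieval triplet (image
# anchor, paired recipe positive, unpaired recipe negative) with a second
# semantic triplet built from class labels -- toy 2-d embeddings here.
img = np.array([1.0, 0.0])
recipe_match = np.array([0.9, 0.1])
recipe_other = np.array([-1.0, 2.0])
loss = triplet_loss(img, recipe_match, recipe_other)
```

When the matching recipe is already closer than the mismatched one by more than the margin, the hinge gives zero loss, so training focuses on hard pairs.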
“…Recipe1M [27] is the only publicly available large-scale food dataset with English recipes and images. Many related works [6,26,27,32] are based on this dataset. The raw dataset contains more than 1 million recipes and almost 900k images.…”
Section: Experiments 4.1 Datasets
confidence: 99%
“…People tend to spend much time on recipes because cooking is closely tied to daily life. Much work has been done to deconstruct and understand food, including food classification [8,16], recipe-image embedding [6,27,32], and image-to-recipe generation [26]. Furthermore, visualizing a dish's appearance in advance would greatly help in designing new recipes, which makes image generation from given recipes a task of evident significance.…”
Section: Introduction
confidence: 99%