2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019
DOI: 10.1109/cvpr.2019.01070

Inverse Cooking: Recipe Generation From Food Images

Abstract: People enjoy food photography because they appreciate food. Behind each meal there is a story described in a complex recipe and, unfortunately, by simply looking at a food image we do not have access to its preparation process. Therefore, in this paper we introduce an inverse cooking system that recreates cooking recipes given food images. Our system predicts ingredients as sets by means of a novel architecture, modeling their dependencies without imposing any order, and then generates cooking instructions by …

Citations: cited by 128 publications (176 citation statements)
References: 39 publications
“…[Wang et al 2019;Zhu et al 2019] further introduced adversarial networks to impose the modality alignment for cross-modal retrieval. [Salvador et al 2019] proposed a new architecture for ingredient prediction that exploits co-dependencies among ingredients without imposing order for generating cooking instructions from an image and its ingredients.…”
Section: Reference (mentioning)
confidence: 99%
“…Some researchers have been tackling problems to generate a procedural text from various inputs. In cooking domain, Salvador et al (2019) tried to generate a recipe from an image of a complete dish. Bosselut et al (2018) and Kiddon et al assumed a title and ingredients as the input.…”
Section: Related Work (mentioning)
confidence: 99%
“…For this reason procedural text generation seemed to be difficult and was initially solved by formulating it as a retrieval task (Salvador et al, 2017;Zhu et al, 2019;Chen and Ngo, 2016). Another similar task setting is recipe generation from a photo of the final dish using ingredient predictor (Salvador et al, 2019). This setting may be, however, very difficult or even impossible because a single photo of the final dish does not contain sufficient information for its production procedure.…”
Section: Introduction (mentioning)
confidence: 99%
“…Extracting the quality and quantity of recipes and ingredients [29,32] is a key precursor in many application areas of food computing, including healthy recommendation [33]. The multi-modal aspect of recipes has shown promise in enhancing cooking procedure understanding [40] by using auxiliary data such as video [22,30] or images [27,42]. The existing work on leveraging these modalities for recommendation [17,34,35] uses established pre-trained image models or specific image features (including measures of sharpness, contrast).…”
Section: Introduction and Related Work (mentioning)
confidence: 99%