Inverse Cooking: Recipe Generation From Food Images

Salvador, Amaia; Drozdzal, Michal; Giró-i-Nieto, Xavier; Romero, Adriana

doi:10.1109/cvpr.2019.01070

Cited by 128 publications

(176 citation statements)

References 39 publications

Supporting

Mentioning

174

Contrasting

Unclassified

“…[Wang et al 2019; Zhu et al 2019] further introduced adversarial networks to impose the modality alignment for cross-modal retrieval. [Salvador et al 2019] proposed a new architecture for ingredient prediction that exploits co-dependencies among ingredients without imposing order for generating cooking instructions from an image and its ingredients.…”

Section: Reference (mentioning)

confidence: 99%

A Survey on Food Computing

et al. 2019

Food is essential to human life and fundamental to the human experience. Food-related study may support multifarious applications and services, such as guiding human behavior, improving human health, and understanding culinary culture. With the rapid development of social networks, mobile networks, and the Internet of Things (IoT), people commonly upload, share, and record food images, recipes, cooking videos, and food diaries, leading to large-scale food data. Large-scale food data offers rich knowledge about food and can help tackle many central issues of human society. Therefore, it is time to group several disparate issues related to food computing. Food computing acquires and analyzes heterogeneous food data from disparate sources for perception, recognition, retrieval, recommendation, and monitoring of food. In food computing, computational approaches are applied to address food-related issues in medicine, biology, gastronomy, and agronomy. Both large-scale food data and recent breakthroughs in computer science are transforming the way we analyze food data. Therefore, a vast amount of work has been conducted in the food area, targeting different food-oriented tasks and applications. However, there are very few systematic reviews that shape this area well and provide a comprehensive and in-depth summary of current efforts or detail open problems in this area. In this paper, we formalize food computing and present such a comprehensive overview of various emerging concepts, methods, and tasks. We summarize key challenges and future directions ahead for food computing. This is the first comprehensive survey that targets the study of computing technology for the food area and also offers a collection of research studies and technologies to benefit researchers and practitioners working in different food-related fields.

“…Some researchers have been tackling problems to generate a procedural text from various inputs. In cooking domain, Salvador et al (2019) tried to generate a recipe from an image of a complete dish. Bosselut et al (2018) and Kiddon et al assumed a title and ingredients as the input.…”

Section: Related Work (mentioning)

confidence: 99%

“…For this reason procedural text generation seemed to be difficult and was initially solved by formulating it as a retrieval task (Salvador et al, 2017; Zhu et al, 2019; Chen and Ngo, 2016). Another similar task setting is recipe generation from a photo of the final dish using ingredient predictor (Salvador et al, 2019). This setting may be, however, very difficult or even impossible because a single photo of the final dish does not contain sufficient information for its production procedure.…”

Section: Introduction (mentioning)

confidence: 99%

Procedural Text Generation from a Photo Sequence

Nishimura, Hashimoto, Mori 2019

Proceedings of the 12th International Conference on Natural Language Generation

Multimedia procedural texts, such as instructions and manuals with pictures, help people share how-to knowledge. In this paper, we propose a method for generating a procedural text from a photo sequence, allowing users to obtain a multimedia procedural text. We propose a single embedding space for both images and text, which makes it possible to interconnect them and to select appropriate words to describe a photo. We implemented our method and tested it on cooking instructions, i.e., recipes. Various experimental results showed that our method outperforms standard baselines.

“…Extracting the quality and quantity of recipes and ingredients [29,32] is a key precursor in many application areas of food computing, including healthy recommendation [33]. The multi-modal aspect of recipes has shown promise in enhancing cooking procedure understanding [40] by using auxiliary data such as video [22,30] or images [27,42]. The existing work on leveraging these modalities for recommendation [17,34,35] uses established pre-trained image models or specific image features (including measures of sharpness, contrast).…”

Section: Introduction and Related Work (mentioning)

confidence: 99%

Towards Multi-Language Recipe Personalisation and Recommendation

Twomey, Fain, Ponikar et al. 2020

Fourteenth ACM Conference on Recommender Systems

Multi-language recipe personalisation and recommendation is an under-explored field of information retrieval in academic and production systems. The existing gaps in our current understanding are numerous, even on fundamental questions such as whether consistent and high-quality recipe recommendation can be delivered across languages. Motivated by this need, we consider the multi-language recipe recommendation setting and present grounding results that will help to establish the potential and absolute value of future work in this area. Our work draws on several billion events from millions of recipes, with published recipes and users incorporating several languages, including Arabic, English, Indonesian, Russian, and Spanish. We represent recipes using a combination of normalised ingredients, standardised skills and image embeddings obtained without human intervention. In modelling, we take a classical approach based on optimising an embedded bi-linear user-item metric space towards the interactions that most strongly elicit cooking intent. For users without interaction histories, a bespoke content-based cold-start model that predicts context and recipe affinity is introduced. We show that our approach to personalisation is stable and scales well to new languages. A robust cross-validation campaign is employed and consistently rejects baseline models and representations, strongly favouring those we propose. Our results are presented in a language-oriented (as opposed to model-oriented) fashion to emphasise the language-based goals of this work. We believe that this is the first large-scale work that evaluates the value and potential of multi-language recipe recommendation and personalisation.
