Food ontologies require significant effort to create and maintain as they involve manual and time-consuming tasks, often with limited alignment to the underlying food science knowledge. We propose a semi-supervised framework for the automated ontology population from an existing ontology scaffold by using word embeddings. Having applied this on the domain of food and subsequent evaluation against an expert-curated ontology, FoodOn, we observe that the food word embeddings capture the latent relationships and characteristics of foods. The resulting ontology, which utilizes word embeddings trained from the Wikipedia corpus, has an improvement of 89.7% in precision when compared to the expert-curated ontology FoodOn (0.34 vs. 0.18, respectively, p value = 2.6 × 10–138), and it has a 43.6% shorter path distance (hops) between predicted and actual food instances (2.91 vs. 5.16, respectively, p value = 4.7 × 10–84) when compared to other methods. This work demonstrates how high-dimensional representations of food can be used to populate ontologies and paves the way for learning ontologies that integrate contextual information from a variety of sources and types.
The computational modeling of food processing, aimed at various applications including industrial automation, robotics, food safety, preservation, energy conservation, and recipe nutrition estimation, has been ongoing for decades within food science research labs, industry, and regulatory agencies. The datasets from this prior work have the potential to advance the field of data-driven modeling if they can be harmonized, but this requires a standardized language as a starting point. Our primary goal is to explore two interdependent aspects of this language: the granularity of process modeling sub-parts and parameter details and the substitution of compatible inputs and processes. A delicate semantic distinction—categorizing planned processes based on the objectives they seek to fulfill vs. categorizing them by the actions or mechanisms they utilize—helps organize and facilitate this endeavor. To bring an ontological lens to process modeling, we employ the Open Biological and Biomedical Ontology Foundry ontological framework to organize two main classes of the FoodOn upper-level material processing hierarchy according to objective and mechanism, respectively. We include examples of material processing by mechanism, ranging from abstract ones such as “application of energy” down to specific classes such as “heating by microwave.” Similarly, material processing by objective—often a transformation to bring about materials with certain qualities or composition—can, for example, range from “material processing by heating threshold” to “steaming rice”.
The future of personalized health relies on knowledge of dietary composition. The current analytical methods are impractical to scale up, and the computational methods are inadequate. We propose machine learning models to predict the nutritional profiles of cooked foods given the raw food composition and cooking method, for a variety of plant and animal-based foods. Our models (trained on USDAs SR dataset) were on average 31% better than baselines, based on RMSE metric, and particularly good for leafy green vegetables and various cuts of beef. We also identified and remedied a bias in the data caused by representation of composition per 100grams. The scaling methods are based on a process-invariant nutrient, and the scaled data improves prediction performance. Finally, we advocate for an integrated approach of data analysis and modeling when generating future composition data to make the task more efficient, less costly and apply for development of reliable models.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.