We have acquired panelist data that provides hedonic (liking) ratings for a set of 40 flavors each composed of the same 7 ingredients at different concentration levels. Our goal is to use this data to predict other flavors, composed of the same ingredients in new combinations, which the panelist will like. We describe how we first employ Pareto-Genetic Programming (GP) to generate a surrogate for the human panelist from the 40 observations. This surrogate, in fact an ensemble of GP symbolic regression models, can predict liking scores for flavors outside the observations and provide a confidence in the prediction. We then employ a multi-objective particle swarm optimization (MOPSO) to design a well and consistently liked flavor suite for a panelist. The MOPSO identifies flavors that are well liked (high liking score), and consistently-liked (of maximum confidence). Further, we generate flavors that are well and consistently liked by a cluster of panelists, by giving the MOPSO slightly different objectives.
This paper presents a novel approach for knowledge mining from a sparse and repeated measures dataset. Genetic programming based symbolic regression is employed to generate multiple models that provide alternate explanations of the data. This set of models, called an ensemble, is generated for each of the repeated measures separately. These multiple ensembles are then utilized to generate information about, (a) which variables are important in each ensemble, (b) cluster the ensembles into different groups that have similar variables that drive their response variable, and (c) measure sensitivity of response with respect to the important variables. We apply our methodology to a sensory science dataset. The data contains hedonic evaluations (liking scores), assigned by a diverse set of human testers, for a small set of flavors composed from seven ingredients. Our approach: (1) identifies the important ingredients that drive the liking score of a panelist and (2) segments the panelists into groups that are driven by the same ingredient, and (3) enables flavor scientists to perform the sensitivity analysis of liking scores relative to changes in the levels of important ingredients.
Abstract. We describe a data mining framework that derives panelist information from sparse flavour survey data. One component of the framework executes genetic programming ensemble based symbolic regression. Its evolved models for each panelist provide a second component with all plausible and uncorrelated explanations of how a panelist rates flavours. The second component bootstraps the data using an ensemble selected from the evolved models, forms a probability density function for each panelist and clusters the panelists into segments that are easy to please, neutral, and hard to please.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.