The potential of analytical chemistry to predict sensory qualities of food materials is a major current theme. Standard practice is cross-validation (CV), in which a set of chemical and associated sensory data is partitioned so that chemometric models can be developed on training subsets and validated on held-out subsets. CV demonstrates prediction, but it is an unrealistic scenario for industrial operations, where acquiring data for model development and for test materials at the same time would be unwieldy. We therefore evaluated cocoa materials of diverse provenance that were analyzed on different dates from those used in model development. Liquor extracts were analyzed by flow-injection electrospray-mass spectrometry (FIE-MS), a novel method for sensory quality prediction. FIE-MS enabled prediction of sensory qualities described by trained human panelists. The best models came from the Weka data-mining algorithm SimpleLinearRegression, which fits a regression on the single attribute giving minimal training error; here that attribute was (-)-epicatechin. This flavonoid likewise dominated partial least-squares (PLS) regression models. Refinements of PLS (orthogonal PLS or orthogonal signal correction) generalized more poorly to different test sets, as did support vector machines, whose hyperparameters could not be optimized during training in a way that avoided overfitting. In conclusion, if chemometric overfitting is avoided, chemical analysis can predict sensory qualities of food materials under operationally realistic conditions.
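The single-attribute selection and the separate-batch validation described above can be illustrated with a minimal sketch. This is not the authors' code or Weka itself: it is a NumPy re-implementation of the same idea (fit a one-variable least-squares line per attribute, keep the one with the lowest training squared error), run on synthetic data. The function names, the column index standing in for the dominant signal, and the batch offset are all hypothetical.

```python
import numpy as np


def fit_simple_linear_regression(X, y):
    """Fit a one-variable least-squares line for each column of X and
    return the fit with the lowest training sum of squared errors."""
    best = None
    y_mean = y.mean()
    for j in range(X.shape[1]):
        x = X[:, j]
        x_centered = x - x.mean()
        denom = np.dot(x_centered, x_centered)
        if denom == 0:
            continue  # constant column carries no information
        slope = np.dot(x_centered, y - y_mean) / denom
        intercept = y_mean - slope * x.mean()
        sse = float(np.sum((y - (slope * x + intercept)) ** 2))
        if best is None or sse < best["sse"]:
            best = {"index": j, "slope": float(slope),
                    "intercept": float(intercept), "sse": sse}
    return best


def predict(model, X):
    """Predict from the single selected column."""
    return model["slope"] * X[:, model["index"]] + model["intercept"]


def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Synthetic stand-in data (assumption): 5 spectral features, with
# column 2 playing the role of the dominant signal.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(30, 5))
y_train = 2.0 * X_train[:, 2] + 1.0 + rng.normal(scale=0.05, size=30)

# A separate "batch" analyzed later, with a small shift in feature values,
# used only for testing and never seen during model development.
X_test = rng.normal(size=(20, 5)) + 0.3
y_test = 2.0 * X_test[:, 2] + 1.0 + rng.normal(scale=0.05, size=20)

model = fit_simple_linear_regression(X_train, y_train)
test_rmse = rmse(y_test, predict(model, X_test))
print("selected column:", model["index"], "test RMSE:", round(test_rmse, 3))
```

The point of the design is that the test batch plays no part in choosing the attribute or fitting the line, which mirrors validation on materials analyzed on different dates rather than on held-out folds of one data set.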