Relationship between structural variants of enzymes and metabolic phenotypes in human population was investigated based on the association study of metabolite quantitative traits with whole genome sequence data for 512 individuals from a population cohort. We identified five significant associations between metabolites and non-synonymous variants. Four of these non-synonymous variants are located in enzymes involved in metabolic disorders, and structural analyses of these moderate non-synonymous variants demonstrate that they are located in peripheral regions of the catalytic sites or related regulatory domains. In contrast, two individuals with larger changes of metabolite levels were also identified, and these individuals retained rare variants, which caused non-synonymous variants located near the catalytic site. These results are the first demonstrations that variant frequency, structural location, and effect for phenotype correlate with each other in human population, and imply that metabolic individuality and susceptibility for diseases may be elicited from the moderate variants and much more deleterious but rare variants.
To solve major limitations in algorithms for the metabolite-based prediction of psychiatric phenotypes, a novel prediction model for depressive symptoms based on nonlinear feature selection machine learning, the Hilbert-Schmidt independence criterion least absolute shrinkage and selection operator (HSIC Lasso) algorithm, was developed and applied to a metabolomic dataset with the largest sample size to date. In total, 897 population-based subjects were recruited from the communities affected by the Great East Japan Earthquake; 306 metabolite features (37 metabolites identified by nuclear magnetic resonance measurements and 269 characterized metabolites based on the intensities from mass spectrometry) were utilized to build prediction models for depressive symptoms as evaluated by the Center for Epidemiologic Studies-Depression Scale (CES-D). The nested fivefold cross-validation was used for developing and evaluating the prediction models. The HSIC Lasso-based prediction model showed better predictive power than the other prediction models, including Lasso, support vector machine, partial least squares, random forest, and neural network. L-leucine, 3-hydroxyisobutyrate, and gamma-linolenyl carnitine frequently contributed to the prediction. We have demonstrated that the HSIC Lasso-based prediction model integrating nonlinear feature selection showed improved predictive power for depressive symptoms based on metabolome data as well as on risk metabolites based on nonlinear statistics in the Japanese population. Further studies should use HSIC Lasso-based prediction models with different ethnicities to investigate the generality of each risk metabolite for predicting depressive symptoms.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.