Our proof-of-concept study develops a suspect screening workflow to identify and prioritize potentially ubiquitous chemical exposures in matched maternal/cord blood samples, which reflect exposures during a critical period of development for future health risks. We applied liquid chromatography−quadrupole time-of-flight tandem mass spectrometry (LC-QTOF/MS) to perform suspect screening for ∼3500 industrial chemicals on pilot data from 30 paired maternal and cord serum samples (n = 60). We matched 662 suspect features in positive ionization mode and 788 in negative ionization mode (557 unique formulas overall) to compounds in our database, and selected 208 of these for fragmentation analysis based on detection frequency, correlation in feature intensity between maternal and cord samples, and peak area differences by demographic characteristics. We tentatively identified 73 suspects through fragmentation spectra matching and confirmed 17 chemical features (15 unique compounds) using analytical standards. We tentatively identified 55 compounds not previously reported in the literature, the majority of which have limited to no information about their sources or uses. Examples include (i) 1-(1-acetyl-2,2,6,6-tetramethylpiperidin-4-yl)-3-dodecylpyrrolidine-2,5-dione (a known high production volume chemical); (ii) methyl perfluoroundecanoate and 2-perfluorooctyl ethanoic acid (two PFAS compounds); and (iii) Sumilizer GA 80 (plasticizer). Thus, our workflow demonstrates an approach to evaluating the chemical exposome to identify and prioritize chemical exposures during a critical period of development.
Recent technological advances in mass spectrometry have enabled us to screen biological samples for a very broad spectrum of chemical compounds, allowing us to more comprehensively characterize the human exposome in critical periods of development. The goal of this study was three-fold: 1) to analyze 590 matched maternal and cord blood samples (total 295 pairs) using non-targeted analysis (NTA); 2) to examine the differences in chemical abundance between maternal and cord blood samples; and 3) to examine the associations between exogenous chemicals and endogenous metabolites. We analyzed all samples with high-resolution mass spectrometry (HRMS) using liquid chromatography-quadrupole time-of-flight mass spectrometry (LC-QTOF/MS), in both positive and negative electrospray ionization modes (ESI+ and ESI−) and in soft ionization (MS) and fragmentation (MS/MS) modes for prioritized features. We confirmed 19 unique compounds with analytical standards, tentatively identified 73 compounds with MS/MS spectra matching, and annotated 98 compounds using an annotation algorithm. We observed 103 significant associations in maternal and 128 in cord samples between compounds annotated as endogenous and compounds annotated as exogenous. An example of these relationships was an association between 3 poly- and perfluoroalkyl substances (PFAS) and endogenous fatty acids in both the maternal and cord samples, indicating potential interactions between PFAS and fatty acid regulating proteins.
Non-targeted analysis provides a comprehensive approach to analyze environmental and biological samples for nearly all chemicals present. One of the main shortcomings of current analytical methods and workflows is that they are unable to provide any quantitative information, which constitutes an important obstacle in understanding environmental fate and human exposure. Herein, we present an in silico quantification method using machine learning for chemicals analyzed using electrospray ionization (ESI). We considered three data sets from different instrumental setups: (i) capillary electrophoresis electrospray ionization-mass spectrometry (CE-MS) in positive ionization mode (ESI+), (ii) liquid chromatography quadrupole time-of-flight mass spectrometry (LC-QTOF/MS) in ESI+, and (iii) LC-QTOF/MS in negative ionization mode (ESI−). We developed and applied two different machine-learning algorithms, a random forest (RF) and an artificial neural network (ANN), to predict the relative response factors (RRFs) of different chemicals based on their physicochemical properties. Chemical concentrations can then be calculated by dividing the measured abundance of a chemical, as peak area or peak height, by its corresponding RRF. We evaluated our models and tested their predictive power using 5-fold cross-validation (CV) and y-randomization. Both the RF and the ANN models showed great promise in predicting RRFs. However, the accuracy of the predictions was dependent on the data set composition and the experimental setup. For the CE-MS ESI+ data set, the best model predicted measured RRFs with a mean absolute error (MAE) of 0.19 log units and a cross-validation coefficient of determination (Q²) of 0.84 for the testing set. For the LC-QTOF/MS ESI+ data set, the best model predicted measured RRFs with an MAE of 0.32 and a Q² of 0.40. For the LC-QTOF/MS ESI− data set, the best model predicted measured RRFs with an MAE of 0.50 and a Q² of 0.20. Our findings suggest that machine-learning algorithms can be used for predicting concentrations of nontargeted chemicals with reasonable uncertainties, especially in ESI+, while the application to ESI− remains a more challenging problem.
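The quantification step described in this abstract (abundance divided by RRF) can be illustrated with a minimal sketch. It assumes the model returns the RRF as a log10 value, and it treats the reported MAE in log units as a rough multiplicative error factor; both are illustrative assumptions rather than details stated in the abstract, and the function names are hypothetical.

```python
def concentration_from_rrf(abundance: float, log10_rrf: float) -> float:
    """Estimate a concentration by dividing the measured abundance
    (peak area or peak height) by the relative response factor.

    Assumption: the predicted RRF is supplied as log10(RRF).
    """
    if abundance < 0:
        raise ValueError("abundance must be non-negative")
    return abundance / (10.0 ** log10_rrf)


def fold_error_bounds(concentration: float, mae_log_units: float) -> tuple:
    """Return (low, high) bounds by treating an MAE in log10 units as a
    typical multiplicative error factor (an illustrative simplification)."""
    factor = 10.0 ** mae_log_units
    return concentration / factor, concentration * factor


# Hypothetical feature: peak area 1000 with a predicted log10 RRF of 2.0.
estimate = concentration_from_rrf(1000.0, 2.0)
low, high = fold_error_bounds(estimate, 0.19)
print(estimate, low, high)
```

Under these assumptions, an MAE of 0.19 log units corresponds to an error factor of about 1.5, while an MAE of 0.50 corresponds to a factor of about 3.2, which is one way to read the difference between the ESI+ and ESI− results.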
Uncertainties in the physicochemical properties of volatile methylsiloxanes have resulted in substantial uncertainties in calculations of concentrations and persistence. Choosing the right set of properties seems crucial for making accurate predictions.
While important advances have been made in high-resolution mass spectrometry (HRMS) and its applications in non-targeted analysis (NTA), the number of identified compounds in biological and environmental samples often does not exceed 5% of the detected chemical features. Our aim was to develop a computational pipeline that leverages data from HRMS but also incorporates physicochemical properties (equilibrium partition ratios between organic solvents and water; K_solvent–water) and can propose molecular structures for detected chemical features. As these physicochemical properties are often sufficiently different across isomers, when put together, they can form a unique profile for each isomer, which we describe as the “physicochemical fingerprint”. In our study, we used a comprehensive database of compounds that have been previously reported in human blood and collected their K_solvent–water values for 129 partitioning systems. We used RDKit to calculate the number of RDKit fragments and the number of RDKit bits per molecule. We then developed and trained an artificial neural network, which used as an input the physicochemical fingerprint of a chemical feature and predicted the number and types of RDKit fragments and RDKit bits present in that structure. These were then used to search the database and propose chemical structures. The average success rate of predicting the right chemical structure ranged from 60 to 86% for the training set and from 48 to 81% for the testing set. These observations suggest that physicochemical fingerprints can assist in the identification of compounds with NTA and substantially improve the number of identified compounds.
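The final step of this pipeline, searching the database with predicted fragment counts, can be sketched as follows. The abstract does not state how candidates are scored, so the count-based Tanimoto-style similarity used here is an assumption, as are the function names, the toy database, and the fragment keys (written in the style of RDKit fragment descriptor names). The sketch does not call RDKit.

```python
from typing import Dict, List, Tuple

Counts = Dict[str, int]  # fragment name -> number of occurrences


def count_similarity(a: Counts, b: Counts) -> float:
    """Tanimoto-style similarity on fragment counts:
    sum of per-fragment minima divided by sum of per-fragment maxima."""
    keys = set(a) | set(b)
    shared = sum(min(a.get(k, 0), b.get(k, 0)) for k in keys)
    total = sum(max(a.get(k, 0), b.get(k, 0)) for k in keys)
    return shared / total if total else 1.0


def rank_candidates(predicted: Counts,
                    database: Dict[str, Counts]) -> List[Tuple[str, float]]:
    """Rank database entries by similarity to the predicted fragment counts
    (highest first, ties broken alphabetically by name)."""
    scored = [(name, count_similarity(predicted, counts))
              for name, counts in database.items()]
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical prediction and a three-entry toy database.
predicted = {"fr_benzene": 1, "fr_COO": 1}
database = {
    "candidate_A": {"fr_benzene": 1, "fr_COO": 1},
    "candidate_B": {"fr_benzene": 1},
    "candidate_C": {"fr_halogen": 2},
}
print(rank_candidates(predicted, database))
```

In practice the candidate list would presumably first be restricted to entries consistent with the HRMS data for the feature, so that the fragment-based ranking mainly separates isomers.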
The exposome has been recognized as an important dimension in understanding human disease and complementing the genome but remains largely uncharacterized. We analyzed 295 matched maternal and cord blood samples using non-targeted high-resolution mass spectrometry and characterized exposome features. We compared the chemical enrichment of the maternal and cord blood samples using a similarity network analysis and examined the interactions between the exogenous and the endogenous chemical features using a molecular interaction networks approach. We detected over 700 chemical features in the maternal and cord pairs, and we found that maternal samples are more similar in terms of chemical enrichment to their corresponding cord samples than to other maternal samples or other cord samples. We observed significant associations between 3 poly- and perfluoroalkyl substances (PFAS) and endogenous fatty acids in both the maternal and cord samples, indicating important interactions between PFAS and fatty acid regulating proteins. To our knowledge, this is the first non-targeted analysis study that uses such a large cohort to characterize the prenatal exposome.
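The pairwise comparison reported above (a maternal sample resembling its own cord sample more than other cord samples) can be illustrated with a small sketch. The abstract does not specify the similarity measure behind the network analysis, so Jaccard similarity on sets of detected features is an assumption here, and the function names and toy data are hypothetical. The sketch compares each maternal sample against cord samples only, not against other maternal samples.

```python
from typing import Dict, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two sets of detected feature IDs."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def fraction_best_matched(maternal: Dict[str, Set[str]],
                          cord: Dict[str, Set[str]]) -> float:
    """Fraction of pairs in which the maternal sample is strictly more
    similar to its own cord sample than to every other cord sample.
    Both dicts are keyed by the same pair IDs."""
    hits = 0
    for pair_id, m_features in maternal.items():
        own = jaccard(m_features, cord[pair_id])
        others = [jaccard(m_features, c_features)
                  for cid, c_features in cord.items() if cid != pair_id]
        if all(own > other for other in others):
            hits += 1
    return hits / len(maternal)


# Toy data: two hypothetical mother/cord pairs with detected feature IDs.
maternal = {"pair1": {"f1", "f2", "f3"}, "pair2": {"f7", "f8"}}
cord = {"pair1": {"f1", "f2"}, "pair2": {"f7", "f8", "f9"}}
print(fraction_best_matched(maternal, cord))
```

A set-based measure ignores feature intensities; an intensity-aware measure such as a correlation between peak areas would be an equally plausible choice for this kind of comparison.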