The identification of xenobiotics
in nontargeted metabolomic analyses
is a vital step in understanding human exposure. Xenobiotic metabolism,
transformation, excretion, and coexistence with other endogenous molecules,
however, greatly complicate the interpretation of features detected
in nontargeted studies. While mass spectrometry (MS)-based platforms
are commonly used in metabolomic measurements, deconvoluting endogenous
metabolites from xenobiotics is also often challenged by the lack
of xenobiotic parent and metabolite standards as well as the numerous
isomers possible for each small molecule m/z feature. Here, we evaluate a xenobiotic structural annotation
workflow using ion mobility spectrometry coupled with MS (IMS–MS),
mass defect filtering, and machine learning to uncover potential xenobiotic
classes and species in large metabolomic feature lists. The xenobiotic
classes examined were those of known high toxicity, including
per- and polyfluoroalkyl substances (PFAS), polycyclic aromatic hydrocarbons
(PAHs), polychlorinated biphenyls (PCBs), polybrominated diphenyl
ethers (PBDEs), and pesticides. Specifically, when the workflow was
applied to identify PFAS in the NIST SRM 1957 and 909c human serum
samples, it greatly reduced the hundreds of detected liquid chromatography
(LC)–IMS–MS features by utilizing both mass defect filtering
and the relationship between m/z and IMS collision cross
section. These potential PFAS features were then compared
to the EPA CompTox entries, and while some matched within specific m/z tolerances, many remained unknown,
illustrating the importance of nontargeted studies for detecting new
molecules with known chemical characteristics. This
workflow can also be used to evaluate other xenobiotics and enable
more confident annotations from nontargeted studies.
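The mass defect filtering and m/z tolerance matching described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the mass defect window, the 5 ppm tolerance, and the feature m/z values are all hypothetical, and the m/z versus collision cross section step is not shown. The two library entries are [M − H]⁻ m/z values calculated from the monoisotopic masses of PFOA (C8HF15O2) and PFOS (C8HF17O3S); they stand in for a real reference list such as EPA CompTox.

```python
def mass_defect(mz):
    """Mass defect: difference between an m/z value and its nearest integer."""
    return mz - round(mz)


def within_ppm(measured, theoretical, tol_ppm):
    """True if the measured m/z lies within tol_ppm of the theoretical m/z."""
    return abs(measured - theoretical) / theoretical * 1e6 <= tol_ppm


def screen_features(feature_mzs, library, defect_window, tol_ppm):
    """Keep features whose mass defect falls in the window, then split them
    into library matches and unknowns."""
    low, high = defect_window
    matched, unknown = {}, []
    for mz in feature_mzs:
        if not low <= mass_defect(mz) <= high:
            continue  # outside the mass defect window: filtered out
        names = [name for name, ref in library.items()
                 if within_ppm(mz, ref, tol_ppm)]
        if names:
            matched[mz] = names
        else:
            unknown.append(mz)
    return matched, unknown


# Calculated [M - H]- m/z values (stand-in for a real reference list).
library = {"PFOA [M-H]-": 412.9664, "PFOS [M-H]-": 498.9302}

# Illustrative feature list (assumed values, not measured data).
features = [255.2330, 412.9660, 498.9310, 600.9500]

# Hypothetical mass defect window and ppm tolerance.
matched, unknown = screen_features(features, library, (-0.25, 0.10), 5.0)
print(matched)
print(unknown)
```

Heavily fluorinated molecules tend to have small or negative mass defects because fluorine's monoisotopic mass (18.9984 Da) lies slightly below its nominal mass, which is why a window of this kind removes hydrogen-rich endogenous features such as the one at m/z 255.2330.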
Metabolite annotation continues to be a widely accepted bottleneck in nontargeted metabolomics workflows. Annotation of metabolites typically relies on a combination of high-resolution mass spectrometry (MS) with parent and tandem measurements, isotope cluster evaluations, and Kendrick mass defect (KMD) analysis. Chromatographic retention time matching with standards is often used at the later stages of the process, which can also be followed by metabolite isolation and structure confirmation utilizing nuclear magnetic resonance (NMR) spectroscopy. The measurement of gas-phase collision cross-section (CCS) values by ion mobility (IM) spectrometry adds an important dimension to this workflow by generating an additional molecular parameter that can be used for filtering unlikely structures. The millisecond timescale of IM spectrometry allows rapid measurement of CCS values and easy pairing with existing MS workflows. Here, we report on a highly accurate machine learning algorithm (CCSP 2.0) in an open-source Jupyter Notebook format to predict CCS values based on linear support vector regression models. This tool allows customization of the training set to the needs of the user, enabling the production of models for new adducts or previously unexplored molecular classes. CCSP produces predictions with accuracy equal to or greater than existing machine learning approaches such as CCSbase, DeepCCS, and AllCCS, while being better aligned with FAIR (Findable, Accessible, Interoperable, and Reusable) data principles. Another unique aspect of CCSP 2.0 is its inclusion of a large library of 1613 molecular descriptors via the Mordred Python package, further encoding the fine aspects of isomeric molecular structures. CCS prediction accuracy was tested using CCS values in the McLean CCS Compendium, with median relative errors of 1.25, 1.73, and 1.87% for the 170 [M − H]⁻, 155 [M + H]⁺, and 138 [M + Na]⁺ adducts tested, respectively. For superclass-matched data sets, CCS predictions via CCSP allowed filtering of 36.1% of incorrect structures while retaining 100% of the correct annotations using a ΔCCS threshold of 2.8% and a mass error of 10 ppm.
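The candidate filtering step described above, with a 10 ppm mass error and a 2.8% ΔCCS threshold, can be sketched as follows. This is a minimal illustration under stated assumptions, not the CCSP 2.0 code: the function names and candidate records are hypothetical, the m/z and CCS values are illustrative rather than measured or predicted data, and ΔCCS is assumed here to be the absolute difference between predicted and measured CCS relative to the measured value.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Absolute mass error in parts per million."""
    return abs(measured_mz - theoretical_mz) / theoretical_mz * 1e6


def delta_ccs_percent(measured_ccs, predicted_ccs):
    """Absolute CCS difference as a percentage of the measured CCS
    (assumed definition of delta CCS)."""
    return abs(predicted_ccs - measured_ccs) / measured_ccs * 100.0


def filter_candidates(measured_mz, measured_ccs, candidates,
                      ppm_tol=10.0, ccs_tol=2.8):
    """Return names of candidates that pass both the mass error and
    the delta CCS thresholds."""
    kept = []
    for cand in candidates:
        if ppm_error(measured_mz, cand["mz"]) > ppm_tol:
            continue  # rejected on mass error
        if delta_ccs_percent(measured_ccs, cand["predicted_ccs"]) > ccs_tol:
            continue  # rejected on CCS mismatch
        kept.append(cand["name"])
    return kept


# Hypothetical candidate structures for one measured feature.
candidates = [
    {"name": "candidate_A", "mz": 300.0010, "predicted_ccs": 172.0},
    {"name": "candidate_B", "mz": 300.0020, "predicted_ccs": 180.0},
    {"name": "candidate_C", "mz": 300.0100, "predicted_ccs": 170.5},
]

kept = filter_candidates(300.0000, 170.0, candidates)
print(kept)
```

In this example candidate_B passes the mass filter but is removed by the CCS filter, which is the case where a predicted CCS value adds discriminating power beyond accurate mass alone.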
Environmental analysis of xenobiotics is a challenging yet necessary undertaking to characterize pollution levels, assess the effectiveness of remediation interventions, and prevent adverse environmental and health outcomes. Xenobiotics are concerning from an environmental perspective due to their chemical persistence, toxicity to humans and wildlife, and prolific use in agricultural and industrial applications. 1 Many xenobiotics are persistent organic pollutants (POPs), and the number of POPs listed in the Stockholm Convention is