Identification of unknowns is a bottleneck for large-scale untargeted analyses like metabolomics or drug metabolite identification. Ion mobility-mass spectrometry (IM-MS) provides rapid two-dimensional separation of ions based on their mobility through a neutral buffer gas. The mobility of an ion is related to its collision cross section (CCS) with the buffer gas, a physical property that is determined by the size and shape of the ion. This structural dependency makes CCS a promising characteristic for compound identification, but this utility is limited by the availability of high-quality reference CCS values. CCS prediction using machine learning (ML) has recently shown promise in the field, but accurate and broadly applicable models are still lacking. Here we present a novel ML approach that employs a comprehensive collection of CCS values covering a wide range of chemical space. Using this diverse database, we identified the structural characteristics, represented by molecular quantum numbers (MQNs), that contribute to variance in CCS and assessed the performance of a variety of ML algorithms in predicting CCS. We found that by breaking down the chemical structural diversity using unsupervised clustering based on the MQNs, specific and accurate prediction models can be trained for each cluster, and that these cluster-specific models outperformed a single model trained on all data. Using this approach, we have robustly trained and characterized a CCS prediction model with high accuracy on diverse chemical structures. An all-in-one web interface was built for querying the CCS database and accessing the predictive model to support unknown compound identifications.
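The cluster-then-predict idea described in the abstract above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: plain k-means stands in for the unsupervised clustering, a one-variable least-squares line stands in for the ML algorithms that were assessed, two-number vectors stand in for the MQN descriptors, and the data are synthetic.

```python
import random
from statistics import mean


def sqdist(a, b):
    """Squared Euclidean distance between two descriptor vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50, seed=0):
    """Plain k-means; returns the list of k centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sqdist(p, centroids[j]))
            groups[nearest].append(p)
        for j, members in enumerate(groups):
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [mean(col) for col in zip(*members)]
    return centroids


def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x; returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return slope, my - slope * mx


def train(descriptors, ccs, k):
    """Cluster on the descriptors, then fit one CCS model per non-empty cluster."""
    centroids = kmeans(descriptors, k)
    by_cluster = {}
    for d, y in zip(descriptors, ccs):
        j = min(range(k), key=lambda c: sqdist(d, centroids[c]))
        # Hypothetical choice: regress CCS on the first descriptor only.
        by_cluster.setdefault(j, []).append((d[0], y))
    models = {j: fit_line(*zip(*pairs)) for j, pairs in by_cluster.items()}
    return centroids, models


def predict(descriptor, centroids, models):
    """Route a new compound to its nearest trained cluster and apply that model."""
    j = min(models, key=lambda c: sqdist(descriptor, centroids[c]))
    slope, intercept = models[j]
    return intercept + slope * descriptor[0]


# Synthetic toy data (not real measurements): [mass-like value, second descriptor]
small = [[100.0 + i, 0.0] for i in range(5)]
large = [[500.0 + i, 5.0] for i in range(5)]
descriptors = small + large
ccs = [0.5 * d[0] + 100.0 for d in small] + [0.3 * d[0] + 80.0 for d in large]

centroids, models = train(descriptors, ccs, k=2)
print(predict([102.5, 0.0], centroids, models))
print(predict([501.5, 5.0], centroids, models))
```

The point of the design is the routing step in `predict`: a new compound is first assigned to the cluster whose centroid is nearest in descriptor space, and only that cluster's model is used, so each model needs to capture the CCS trend of one structural family only.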
Ion mobility-mass spectrometry (IM-MS) can provide orthogonal information, i.e., m/z and collision cross section (CCS), for the identification of drugs and drug metabolites. However, only a small number of CCS values are available for drugs, which limits the use of CCS as an identification parameter and the assessment of structure–function relationships of drugs using IM-MS. Here, we report the development of a rapid workflow for the measurement of CCS values of a large number of drug or drug-like molecules in nitrogen on the widely available traveling wave IM-MS (TWIM-MS) platform. Using a combination of small molecule and polypeptide CCS calibrants, we successfully determined the nitrogen CCS values of 1425 drug or drug-like molecules in the MicroSource Discovery Systems’ Spectrum Collection using flow injection analysis of 384-well plates. Software was developed to streamline data extraction, processing, and calibration. We found that the overall drug collection covers a wide CCS range for the same mass, suggesting a large structural diversity of these drugs. However, individual drug classes appear to occupy a narrow and unique space in the CCS–mass 2D spectrum, suggesting a tight structure–function relationship for each class of drugs with a specific target. We observed bimodal distributions for several antibiotic species due to multiple protomers, including the known fluoroquinolone protomers and the new finding of cephalosporin protomers. Lastly, we demonstrated the utility of the high-throughput method and drug CCS database by quickly and confidently confirming the active component in a pharmaceutical product.
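Using m/z and CCS as two orthogonal identification parameters, as the abstract above describes, amounts to a two-dimensional library lookup. The sketch below is a hedged illustration only: the `Reference` record, the `match` function, the tolerance defaults (10 ppm for m/z, 2% for CCS), and the library entries are all hypothetical placeholders, not values or software from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    name: str
    mz: float   # mass-to-charge ratio of the ion
    ccs: float  # collision cross section in nitrogen, in square angstroms


def match(mz, ccs, library, mz_tol_ppm=10.0, ccs_tol_pct=2.0):
    """Return library entries within both tolerances, best CCS agreement first.

    The tolerance defaults are assumptions chosen for illustration.
    """
    hits = []
    for ref in library:
        ppm_error = abs(mz - ref.mz) / ref.mz * 1e6
        ccs_error = abs(ccs - ref.ccs) / ref.ccs * 100.0
        if ppm_error <= mz_tol_ppm and ccs_error <= ccs_tol_pct:
            hits.append((ccs_error, ref))
    return [ref for _, ref in sorted(hits, key=lambda h: h[0])]


# Placeholder entries (not measured values): A and B share an m/z but differ in CCS.
library = [
    Reference("compound_A", 300.1000, 170.0),
    Reference("compound_B", 300.1000, 185.0),
    Reference("compound_C", 450.2000, 200.0),
]

for hit in match(300.1005, 171.0, library):
    print(hit.name)
```

In this toy library the m/z alone cannot separate `compound_A` from `compound_B`; the CCS tolerance is what removes the second candidate, which is the sense in which the two measurements are orthogonal.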
Comprehensive profiling of lipid species in a biological sample, or lipidomics, is a valuable approach to elucidating disease pathogenesis and identifying biomarkers. Currently, a typical lipidomics experiment may track hundreds to thousands of individual lipid species. However, drawing biological conclusions requires multiple steps of data processing to enrich significantly altered features and confident identification of these features. Existing solutions for these data analysis challenges (i.e., multivariate statistics and lipid identification) involve performing various steps using different software applications, which imposes a practical limitation and potentially a negative impact on reproducibility. Hydrophilic interaction liquid chromatography-ion mobility-mass spectrometry (HILIC-IM-MS) has shown advantages in separating lipids through orthogonal dimensions. However, there are still gaps in the coverage of lipid classes in the literature. To enable reproducible and efficient analysis of HILIC-IM-MS lipidomics data, we developed an open-source Python package, LiPydomics, which enables performing statistical and multivariate analyses (“stats” module), generating informative plots (“plotting” module), identifying lipid species at different confidence levels (“identification” module), and carrying out all functions using a user-friendly text-based interface (“interactive” module). To support lipid identification, we assembled a comprehensive experimental database of m/z and CCS of 45 lipid classes with 23 classes containing HILIC retention times. Prediction models for CCS and HILIC retention time for 22 and 23 lipid classes, respectively, were trained using the large experimental data set, which enabled the generation of a large predicted lipid database with 145,388 entries. Finally, we demonstrated the utility of the Python package using Staphylococcus aureus strains that are resistant to various antimicrobials.
Conventional strategies for drug metabolite identification employ a combination of liquid chromatography-mass spectrometry (LC-MS), which offers higher throughput but provides limited structural information, and nuclear magnetic resonance spectroscopy, which can achieve the most definitive identification but lacks throughput. Ion mobility-mass spectrometry (IM-MS) is a rapid, two-dimensional analysis that separates ions on the basis of their gas-phase size and shape (reflected by collision cross section, CCS) and their mass-to-charge (m/z) ratios, respectively. The rapid nature of IM separation, combined with the structural information provided by CCS, makes IM-MS a promising technique for obtaining more structural information on drug metabolites without sacrificing analytical throughput. Here, we present an in vitro-biosynthesis coupled with IM-MS strategy for rapid generation and analysis of drug metabolites. Drug metabolites were generated in vitro using pooled subcellular fractions derived from human liver and analyzed using a rapid flow injection-IM-MS method. We measured CCS values for 19 parent drugs and their 37 metabolites generated in vitro (78 values in total), representing a wide variety of metabolic modifications. Post-IM fragmentation and computational modeling were used to support metabolite identifications and explore the structural characteristics driving behaviors observed in IM separation. Overall, we found the effects of metabolic modifications on the gas-phase structures of the metabolites to be highly dependent upon the structural characteristics of the parent compounds and the specific position of the modification. This in vitro-biosynthesis coupled with rapid IM-MS analysis workflow represents a promising platform for rapid and high-confidence identification of drug metabolites, applicable at a large scale.