METLIN originated as a database to characterize known metabolites and has since expanded into a technology platform for the identification of known and unknown metabolites and other chemical entities. Through this effort it has become a comprehensive resource containing over 1 million molecules including lipids, amino acids, carbohydrates, toxins, small peptides, and natural products, among other classes. METLIN’s high-resolution tandem mass spectrometry (MS/MS) database, which plays a key role in the identification process, has data generated from both reference standards and their labeled stable isotope analogues, facilitated by METLIN-guided analysis of isotope-labeled microorganisms. The MS/MS data, coupled with the fragment similarity search function, expand the tool’s capabilities into the identification of unknowns. Fragment similarity search is performed independent of the precursor mass, relying solely on the fragment ions to identify similar structures within the database. Stable isotope data also facilitate characterization by coupling the similarity search output with the isotopic m/z shifts. Examples of both are demonstrated here with the characterization of four previously unknown metabolites. METLIN also now features in silico MS/MS data, which has been made possible through the creation of algorithms trained on METLIN’s MS/MS data from both standards and their isotope analogues. With these informatic and experimental data features, METLIN is being designed to address the characterization of known and unknown molecules.
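As an illustration of the idea described above, the following is a minimal sketch of a precursor-independent fragment similarity search, together with a helper that converts an isotopic m/z shift into a count of labeled carbons. It is not METLIN's actual implementation: the function names, the matching tolerance, the minimum number of shared fragments, the assumption of singly charged 13C-labeled ions, and all m/z values in the example library are hypothetical.

```python
from typing import Dict, List, Tuple

# Mass difference between 13C and 12C in Da (standard value).
C13_C12_DELTA = 1.003355


def shared_fragments(query: List[float], reference: List[float],
                     tol_da: float = 0.01) -> int:
    """Count query fragment m/z values matching any reference fragment within tol_da."""
    return sum(any(abs(q - r) <= tol_da for r in reference) for q in query)


def fragment_similarity_search(query: List[float],
                               library: Dict[str, List[float]],
                               tol_da: float = 0.01,
                               min_shared: int = 2) -> List[Tuple[str, int]]:
    """Rank library entries by shared fragment count, ignoring precursor mass entirely."""
    hits = []
    for name, fragments in library.items():
        n = shared_fragments(query, fragments, tol_da)
        if n >= min_shared:
            hits.append((name, n))
    # Most shared fragments first; ties broken alphabetically for determinism.
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


def labeled_carbons(unlabeled_mz: float, labeled_mz: float) -> int:
    """Estimate how many carbons a fragment carries from its 13C m/z shift.

    Assumes a singly charged ion and uniform 13C labeling (an assumption of this sketch).
    """
    return round((labeled_mz - unlabeled_mz) / C13_C12_DELTA)


if __name__ == "__main__":
    # Hypothetical library: entry name -> fragment m/z list (illustrative values only).
    library = {
        "compound_A": [58.065, 86.096, 104.107],
        "compound_B": [60.081, 86.096, 184.073],
        "compound_C": [70.065, 116.071],
    }
    unknown = [86.096, 104.107, 125.000]
    print(fragment_similarity_search(unknown, library))
    print(labeled_carbons(86.096, 91.113))
```

Because the precursor mass never enters the comparison, an unknown can be matched to structurally related entries of a different mass. The carbon count derived from the isotopic shift can then be used to constrain which of the returned structures are plausible.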
Metabolomics, in which small-molecule metabolites (the metabolome) are identified and quantified, is broadly acknowledged to be the omics discipline that is closest to the phenotype 1–3. Although appreciated for its role in biomarker discovery programs, metabolomics can also be used to identify metabolites that could alter a cell’s or an organism’s phenotype. Metabolomics activity screening (MAS) as described here integrates metabolomics data with metabolic pathways and systems biology information, including proteomics and transcriptomics data, to produce a set of endogenous metabolites that can be tested for functionality in altering phenotypes. A growing literature reports the use of metabolites to modulate diverse processes, such as stem cell differentiation, oligodendrocyte maturation, insulin signaling, T-cell survival and macrophage immune responses. This opens up the possibility of identifying and applying metabolites to affect phenotypes. Unlike genes or proteins, metabolites are often readily available, which means that MAS is broadly amenable to high-throughput screening of virtually any biological system.
Machine learning has been extensively applied in small molecule analysis to predict a wide range of molecular properties and processes, including mass spectrometry fragmentation and chromatographic retention time. However, current approaches for retention time prediction lack sufficient accuracy due to limited available experimental data. Here we introduce the METLIN small molecule retention time (SMRT) dataset, an experimentally acquired reverse-phase chromatography retention time dataset covering up to 80,038 small molecules. To demonstrate the utility of this dataset, we deployed a deep learning model for retention time prediction applied to small molecule annotation. Results showed that in 70% of the cases, the correct molecular identity was ranked among the top 3 candidates based on their predicted retention time. We anticipate that this dataset will enable the community to apply machine learning or first principles strategies to generate better models for retention time prediction.
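The top-3 evaluation mentioned above can be sketched as follows. This is a hedged illustration only: it assumes predicted retention times are already available from some model, ranks candidates by absolute difference from the observed retention time, and uses hypothetical function names and made-up retention times in seconds. It does not reproduce the deep learning model itself.

```python
from typing import Dict, List, Tuple


def rank_by_retention_time(observed_rt: float,
                           predicted_rts: Dict[str, float]) -> List[str]:
    """Order candidate names by how close their predicted RT is to the observed RT."""
    return sorted(predicted_rts, key=lambda name: abs(predicted_rts[name] - observed_rt))


def top_k_accuracy(cases: List[Tuple[float, Dict[str, float], str]], k: int = 3) -> float:
    """Fraction of cases whose true identity is among the k best-ranked candidates.

    Each case is (observed_rt, {candidate: predicted_rt}, true_candidate).
    """
    if not cases:
        raise ValueError("at least one case is required")
    hits = sum(
        1 for observed, predictions, truth in cases
        if truth in rank_by_retention_time(observed, predictions)[:k]
    )
    return hits / len(cases)


if __name__ == "__main__":
    # Made-up example: two unknowns, each with four isomeric candidates.
    cases = [
        (300.0, {"a": 310.0, "b": 295.0, "c": 500.0, "d": 302.0}, "a"),
        (600.0, {"a": 605.0, "b": 590.0, "c": 598.0, "d": 900.0}, "d"),
    ]
    print(top_k_accuracy(cases, k=3))
```

In practice the candidate list for each feature would come from an accurate mass or formula search, so retention time acts as an orthogonal filter among candidates that mass alone cannot distinguish.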
Metabolite identification is still considered an imposing bottleneck in liquid chromatography mass spectrometry (LC/MS) untargeted metabolomics. The identification workflow usually begins with detecting relevant LC/MS peaks via peak-picking algorithms and retrieving putative identities based on accurate mass searching. However, accurate mass search alone provides poor evidence for metabolite identification. For this reason, computational annotation is used to reveal the underlying metabolites' monoisotopic masses, improving putative identification in addition to confirmation with tandem mass spectrometry. This review examines LC/MS data from a computational and analytical perspective, focusing on the occurrence of neutral losses and in-source fragments, to understand the challenges in computational annotation methodologies. Herein, we examine the state-of-the-art strategies for computational annotation including: (i) peak grouping or full scan (MS1) pseudo-spectra extraction, i.e., clustering all mass spectral signals stemming from each metabolite; (ii) annotation using ion adduction and mass distance among ion peaks; (iii) incorporation of biological knowledge such as biotransformations or pathways; (iv) tandem MS data; and (v) metabolite retention time calibration, usually achieved by prediction from molecular descriptors. Advantages and pitfalls of each of these strategies are discussed, as well as expected future trends in computational annotation.
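Strategy (ii), annotation from ion adduction and mass distances among peaks, can be illustrated with a short sketch. It assumes the input peaks have already been grouped into one pseudo-spectrum (strategy (i)), considers only four common singly charged positive-mode adducts with commonly tabulated mass shifts, and uses a hypothetical function name and tolerance. Real annotation tools handle many more adducts, charge states, and isotopes.

```python
from itertools import combinations
from typing import List, Tuple

# Mass added to the neutral monoisotopic mass M by each adduct (Da, singly charged).
ADDUCT_SHIFTS = {
    "[M+H]+": 1.007276,
    "[M+NH4]+": 18.033823,
    "[M+Na]+": 22.989218,
    "[M+K]+": 38.963158,
}


def annotate_adduct_pairs(mzs: List[float],
                          tol_da: float = 0.005) -> List[Tuple[float, str, float, str, float]]:
    """Find peak pairs whose m/z distance equals the distance between two adducts.

    Returns tuples of (mz_low, adduct_low, mz_high, adduct_high, neutral_mass).
    """
    adducts = sorted(ADDUCT_SHIFTS.items(), key=lambda item: item[1])
    annotations = []
    for mz_lo, mz_hi in combinations(sorted(mzs), 2):
        for (name_lo, shift_lo), (name_hi, shift_hi) in combinations(adducts, 2):
            if abs((mz_hi - mz_lo) - (shift_hi - shift_lo)) <= tol_da:
                # Both peaks should imply the same neutral mass; report their mean.
                neutral = ((mz_lo - shift_lo) + (mz_hi - shift_hi)) / 2
                annotations.append((mz_lo, name_lo, mz_hi, name_hi, round(neutral, 4)))
    return annotations


if __name__ == "__main__":
    # Peaks consistent with glucose (monoisotopic mass 180.0634 Da) plus one unrelated peak.
    peaks = [181.0707, 203.0526, 150.0000]
    for annotation in annotate_adduct_pairs(peaks):
        print(annotation)
```

A pair of peaks that agree on a neutral mass gives much stronger evidence than a single accurate-mass hit, which is why adduct relationships are used to narrow the mass that is subsequently searched against a database.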
Heme is an essential prosthetic group of numerous proteins and a central signaling molecule in many physiologic processes 1,2. The chemical reactivity of heme requires that a network of intracellular chaperone proteins exist to avert the cytotoxic effects of free heme, but the constituents of such trafficking pathways are unknown 3,4. Heme synthesis is completed in mitochondria, with ferrochelatase (FECH) adding iron to protoporphyrin IX. How this vital but …
Computational metabolite annotation in untargeted profiling aims at uncovering the neutral molecular masses of underlying metabolites and assigning them putative identities. Existing annotation strategies rely on the observation and annotation of adducts to determine metabolite neutral masses. However, a significant fraction of features usually detected in untargeted experiments remains unannotated, which limits our ability to determine neutral molecular masses. Despite the availability of annotation tools, relatively few of them benefit from the inherent presence of in-source fragments in liquid chromatography-electrospray ionization-mass spectrometry. In this study, we introduce a strategy to annotate in-source fragments in untargeted data using low energy tandem MS spectra from the METLIN library. Our algorithm, MISA (METLIN-guided in-source annotation), compares detected features against low energy fragments from MS/MS spectra, enabling robust annotation and putative identification of metabolic features based on low energy spectral matching. The algorithm was evaluated through an annotation analysis of a total of 140 metabolites across three different sets of biological samples analyzed with liquid chromatography-mass spectrometry. Results showed that in cases where adducts were not formed or detected, MISA was able to uncover neutral molecular masses by in-source fragment matching. MISA was also able to provide putative metabolite identities via two annotation scores. These scores take into account the number of in-source fragments matched and the relative intensity similarity between the experimental data and the reference low energy MS/MS spectra.
Overall, results showed that in-source fragmentation is a highly frequent phenomenon that should be considered for comprehensive feature annotation. Thus, combined with adduct annotation, this strategy adds a complementary annotation layer, enabling in-source fragments to be annotated and increasing putative identification confidence. The algorithm is integrated into the XCMS Online platform and is freely available at http://xcmsonline.scripps.edu.
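The two kinds of evidence described for in-source fragment annotation, the number of fragments matched and the relative intensity similarity to a reference low energy MS/MS spectrum, can be sketched as below. This is not the MISA algorithm or its scoring functions: the function name, the m/z tolerance, the use of cosine similarity over matched peaks, and all spectra in the example are assumptions made for illustration.

```python
import math
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def match_in_source_fragments(features: List[Peak],
                              reference: List[Peak],
                              tol_da: float = 0.01) -> Tuple[int, float]:
    """Compare co-eluting MS1 features against a reference low energy MS/MS spectrum.

    Returns (number of reference fragments matched, cosine similarity of the
    matched intensities). The similarity is computed over matched peaks only.
    """
    pairs = []
    for ref_mz, ref_intensity in reference:
        candidates = [
            (abs(mz - ref_mz), intensity)
            for mz, intensity in features
            if abs(mz - ref_mz) <= tol_da
        ]
        if candidates:
            # Keep the feature closest in m/z to this reference fragment.
            _, intensity = min(candidates)
            pairs.append((intensity, ref_intensity))
    if not pairs:
        return 0, 0.0
    dot = sum(exp * ref for exp, ref in pairs)
    norm = (math.sqrt(sum(exp * exp for exp, _ in pairs))
            * math.sqrt(sum(ref * ref for _, ref in pairs)))
    return len(pairs), (dot / norm if norm else 0.0)


if __name__ == "__main__":
    # Hypothetical reference fragments and co-eluting features (illustrative values).
    reference = [(85.030, 100.0), (127.040, 50.0), (145.050, 25.0)]
    features = [(85.031, 2000.0), (127.041, 1000.0), (300.000, 500.0)]
    print(match_in_source_fragments(features, reference))
```

Because cosine similarity over a single matched peak is always 1, the match count and the intensity similarity are best read together rather than in isolation, which is one reason to report both.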
We report XCMS-MRM and METLIN-MRM (http://xcmsonline-mrm.scripps.edu/ and http://metlin.scripps.edu/), a cloud-based data-analysis platform and a public multiple-reaction monitoring (MRM) transition repository for small-molecule quantitative tandem mass spectrometry. This platform provides MRM transitions for more than 15,500 molecules and facilitates data sharing across different instruments and laboratories.
Endogenous metabolites play essential roles in the regulation of cellular identity and activity. Here we have investigated the process of oligodendrocyte precursor cell (OPC) differentiation, a process that becomes limiting during progressive stages of demyelinating diseases, including multiple sclerosis, using mass-spectrometry-based metabolomics. Levels of taurine, an aminosulfonic acid possessing pleiotropic biological activities and broad tissue distribution properties, were found to be significantly elevated (~20-fold) during the course of oligodendrocyte differentiation and maturation. When added exogenously at physiologically relevant concentrations, taurine was found to dramatically enhance the processes of drug-induced in vitro OPC differentiation and maturation. Mechanism of action studies suggest that the oligodendrocyte-differentiation-enhancing activities of taurine are driven primarily by its ability to directly increase available serine pools, which serve as the initial building block required for the synthesis of the glycosphingolipid components of myelin that define the functional oligodendrocyte cell state.