Background: Detection of low abundance metabolites is important for de novo mapping of metabolic pathways related to diet, microbiome or environmental exposures. Multiple algorithms are available to extract m/z features from liquid chromatography-mass spectral data in a conservative manner, which tends to preclude detection of low abundance chemicals and chemicals found in small subsets of samples. The present study provides software to enhance such algorithms for feature detection, quality assessment, and annotation. Results: xMSanalyzer is a set of utilities for automated processing of metabolomics data. The utilities can be classified into four main modules to: 1) improve feature detection for replicate analyses by systematic re-extraction with multiple parameter settings and data merger to optimize the balance between sensitivity and reliability, 2) evaluate sample quality and feature consistency, 3) detect feature overlap between datasets, and 4) characterize high-resolution m/z matches to small molecule metabolites and biological pathways using multiple chemical databases. The package was tested with plasma samples and shown to more than double the number of features extracted while improving quantitative reliability of detection. MS/MS analysis of a random subset of peaks that were exclusively detected using xMSanalyzer confirmed that the optimization scheme improves detection of real metabolites. Conclusions: xMSanalyzer is a package of utilities for data extraction, quality control assessment, detection of overlapping and unique metabolites in multiple datasets, and batch annotation of metabolites. The program was designed to integrate with existing packages such as apLCMS and XCMS, but the framework can also be used to enhance data extraction for other LC/MS data software.
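The fourth module described above matches high-resolution m/z values to database entries. As a rough illustration of that kind of lookup, the Python sketch below matches an observed m/z against a tiny in-memory table within a parts-per-million tolerance. The function name `match_mz`, the 10 ppm default, the restriction to the [M+H]+ adduct, and the two-entry table are all assumptions made for this example; this is not the xMSanalyzer interface, which is an R package.

```python
PROTON_MASS = 1.007276  # mass of a proton in Da, used to form the [M+H]+ adduct


def match_mz(observed_mz, database, ppm_tol=10.0):
    """Return names of entries whose [M+H]+ m/z is within ppm_tol of observed_mz.

    `database` maps a compound name to its neutral monoisotopic mass in Da.
    Only the [M+H]+ adduct is considered (an assumption for this sketch).
    """
    hits = []
    for name, monoisotopic_mass in database.items():
        theoretical_mz = monoisotopic_mass + PROTON_MASS
        ppm_error = abs(observed_mz - theoretical_mz) / theoretical_mz * 1e6
        if ppm_error <= ppm_tol:
            hits.append(name)
    return hits


# Hypothetical lookup table (neutral monoisotopic masses in Da).
toy_db = {"glucose": 180.06339, "glutamate": 147.05316}

glucose_hits = match_mz(181.0707, toy_db)
no_hits = match_mz(200.0, toy_db)
```

A real annotation workflow would consider several adducts and isotopes and would query much larger databases; the point here is only the ppm-window comparison.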
The exposome is the cumulative measure of environmental influences and associated biological responses throughout the lifespan, including exposures from the environment, diet, behavior, and endogenous processes. A major challenge for exposome research lies in the development of robust and affordable analytic procedures to measure the broad range of exposures and associated biologic impacts occurring over a lifetime. Biomonitoring is an established approach to evaluate internal body burden of environmental exposures, but use of biomonitoring for exposome research is often limited by the high costs associated with quantification of individual chemicals. High-resolution metabolomics (HRM) uses ultra-high resolution mass spectrometry with minimal sample preparation to support high-throughput relative quantification of thousands of environmental, dietary, and microbial chemicals. HRM also measures metabolites in most endogenous metabolic pathways, thereby providing simultaneous measurement of biologic responses to environmental exposures. The present research examined quantification strategies to enhance the usefulness of HRM data for cumulative exposome research. The results provide a simple reference standardization protocol in which individual chemical concentrations in unknown samples are estimated by comparison to a concurrently analyzed, pooled reference sample with known chemical concentrations. The approach was tested using blinded analyses of amino acids in human samples and was found to be comparable to independent laboratory results based on surrogate standardization or internal standardization. Quantification was reproducible over a 13-month period and extrapolated to thousands of chemicals. The results show that reference standardization protocol provides an effective strategy that will enhance data collection for cumulative exposome research. In principle, the approach can be extended to other types of mass spectrometry and other analytical methods.
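The reference standardization idea above can be sketched as a single-point calculation: the concentration in an unknown sample is estimated from the ratio of its signal to the signal of the pooled reference, scaled by the reference's known concentration. The Python sketch below assumes a linear response through the origin for each chemical; that assumption, the function name, and the example numbers are illustrative and are not taken from the paper.

```python
def reference_standardize(sample_intensity, reference_intensity, reference_concentration):
    """Estimate a concentration from one pooled-reference measurement.

    Assumes signal is proportional to concentration (linear, zero intercept),
    so C_sample = (I_sample / I_reference) * C_reference.
    """
    if reference_intensity <= 0:
        raise ValueError("reference intensity must be positive")
    return sample_intensity / reference_intensity * reference_concentration


# Hypothetical values: the reference pool holds 50.0 (arbitrary concentration
# units) of a chemical and gives a signal of 1.0e6; the unknown gives 2.0e6.
estimated = reference_standardize(2.0e6, 1.0e6, 50.0)
```

Because the unknown and the reference are analyzed concurrently, this kind of ratio cancels much of the run-to-run variation in instrument response, which is the practical appeal of the approach.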
Improved analytical technologies and data extraction algorithms enable detection of >10,000 reproducible signals by liquid chromatography high-resolution mass spectrometry, creating a bottleneck in chemical identification. In principle, measurement of more than one million chemicals would be possible if algorithms were available to facilitate utilization of the raw mass spectrometry data, especially low abundance metabolites. Here we describe an automated computational framework to annotate ions for possible chemical identity using a multistage clustering algorithm in which metabolic pathway associations are used along with intensity profiles, retention time characteristics, mass defect, and isotope/adduct patterns. The algorithm uses high-resolution mass spectrometry data for a series of samples with common properties and publicly available chemical, metabolic and environmental databases to assign confidence levels to annotation results. Evaluation results show that the algorithm achieves an F1-measure of 0.8 for a dataset with known targets and is more robust than previously reported results for cases when database size is much greater than the actual number of metabolites. MS/MS evaluation of a set of randomly selected 210 metabolites annotated using xMSannotator in an untargeted metabolomics human dataset shows that 80% of features with high or medium confidence scores have ion dissociation patterns consistent with the xMSannotator annotation. The algorithm has been incorporated into an R package, xMSannotator, which includes utilities for querying local or online databases such as ChemSpider, KEGG, HMDB, T3DB, and LipidMaps.
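For readers unfamiliar with the F1-measure quoted above, it is the harmonic mean of precision and recall. The short Python sketch below computes it from counts of true positives, false positives, and false negatives; the counts in the example are hypothetical and chosen only to produce a value of 0.8, and they are not the paper's evaluation data.

```python
def f1_measure(true_pos, false_pos, false_neg):
    """Harmonic mean of precision and recall, written directly in terms of counts."""
    denominator = 2 * true_pos + false_pos + false_neg
    if denominator == 0:
        return 0.0  # no positives predicted or present; treat F1 as 0
    return 2 * true_pos / denominator


# Hypothetical annotation outcome: 80 correct annotations, 20 wrong ones,
# and 20 known targets that were missed.
score = f1_measure(80, 20, 20)
```

With these counts precision and recall are both 0.8, so the F1-measure is also 0.8.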
Various databases have harnessed the wealth of publicly available microarray data to address biological questions ranging from across-tissue differential expression to homologous gene expression. Despite their practical value, these databases rely on relative measures of expression and are unable to address the most fundamental question—which genes are expressed in a given cell type. The Gene Expression Barcode is the first database to provide reliable absolute measures of expression for most annotated genes for 131 human and 89 mouse tissue types, including diseased tissue. This is made possible by a novel algorithm that leverages information from the GEO and ArrayExpress public repositories to build statistical models that permit converting data from a single microarray into expressed/unexpressed calls for each gene. For selected platforms, users may upload data and obtain results in a matter of seconds. The raw data, curated annotation, and code used to create our resource are also available at http://rafalab.jhsph.edu/barcode.
“Sola dosis facit venenum.” These words of Paracelsus, “the dose makes the poison”, can lead to a cavalier attitude concerning potential toxicities of the vast array of low abundance environmental chemicals to which humans are exposed. Exposome research teaches that 80–85% of human disease is linked to environmental exposures. The human exposome is estimated to include >400,000 environmental chemicals, most of which are uncharacterized with regard to human health. In fact, mass spectrometry measures >200,000 m/z features (ions) in microliter volumes derived from human samples; most are unidentified. This crystallizes a grand challenge for chemical research in toxicology: to develop reliable and affordable analytical methods to understand health impacts of the extensive human chemical experience. To this end, there appears to be no choice but to abandon the limitations of measuring one chemical at a time. The present review looks at progress in computational metabolomics to provide probability based annotation linking ions to known chemicals and serve as a foundation for unambiguous designation of unidentified ions for toxicologic study. We review methods to characterize ions in terms of accurate mass m/z, chromatographic retention time, correlation of adduct, isotopic and fragment forms, association with metabolic pathways and measurement of collision-induced dissociation products, collision cross section, and chirality. Such information can support a largely unambiguous system for documenting unidentified ions in environmental surveillance and human biomonitoring. Assembly of this data would provide a resource to characterize and understand health risks of the array of low-abundance chemicals to which humans are exposed.
H1 linker histones facilitate higher-order chromatin folding and are essential for mammalian development. To achieve high-resolution mapping of H1 variants H1d and H1c in embryonic stem cells (ESCs), we have established a knock-in system and shown that the N-terminally tagged H1 proteins are functionally interchangeable with their endogenous counterparts in vivo. H1d and H1c are depleted from GC- and gene-rich regions and active promoters, inversely correlated with H3K4me3, but positively correlated with H3K9me3 and associated with characteristic sequence features. Surprisingly, both H1d and H1c are significantly enriched at major satellites, which display increased nucleosome spacing compared with bulk chromatin. While also depleted at active promoters and enriched at major satellites, overexpressed H1⁰ displays differential binding patterns in specific repetitive sequences compared with H1d and H1c. Depletion of H1c, H1d, and H1e causes pericentric chromocenter clustering and de-repression of major satellites. These results integrate the localization of an understudied type of chromatin proteins, namely the H1 variants, into the epigenome map of mouse ESCs, and we identify significant changes at pericentric heterochromatin upon depletion of this epigenetic mark.
Background: Population-based investigations suggest that red blood cells (RBCs) are therapeutically effective when collected, processed and stored for up to 42 days under validated conditions prior to transfusion. However, some retrospective clinical studies have shown worse patient outcomes when transfused RBCs have been stored for the longest times. Furthermore, studies of RBC persistence in the circulation after transfusion have suggested that considerable donor-to-donor variability exists, and may affect transfusion efficacy. To understand the limitations of current blood storage technologies and to develop approaches to improve RBC storage and transfusion efficacy, we investigated the global metabolic alterations that occur when RBCs are stored in AS-1 (AS1-RBC). Methods: Leukoreduced AS1-RBC units prepared from 9 volunteer research donors (12 total donated units) were serially sampled for metabolomics analysis over 42 days of refrigerated storage. Samples were tested by GC/MS and LC/MS/MS, and specific biochemical compounds were identified by comparison to a library of purified standards. Results: Over three experiments, 185–264 defined metabolites were quantified in stored RBC samples. Kinetic changes in these biochemicals confirmed known alterations in glycolysis and other pathways previously identified in RBCs stored in SAGM (SAGM-RBC). Furthermore, we identified additional alterations not previously seen in SAGM-RBCs (e.g., stable pentose phosphate pathway flux, progressive decreases in oxidized glutathione), and we delineated changes occurring in other metabolic pathways not previously studied (e.g., S-adenosyl methionine cycle). These data are presented in the context of a detailed comparison with previous studies of SAGM-RBCs from human donors and murine AS1-RBCs.
Conclusion: Global metabolic profiling of AS1-RBCs revealed a number of biochemical alterations in stored blood that may affect RBC viability during storage as well as therapeutic effectiveness of stored RBCs in transfusion recipients. Significance: These results provide future opportunities to more clearly pinpoint the metabolic defects during RBC storage, to identify biomarkers for donor screening and prerelease RBC testing, and to develop improved RBC storage solutions and methodologies.
We aimed to characterize metabolites during tuberculosis (TB) disease and identify new pathophysiologic pathways involved in infection as well as biomarkers of TB onset, progression and resolution. Such data may inform development of new anti-tuberculosis drugs. Plasma samples from adults with newly diagnosed pulmonary TB disease and their matched, asymptomatic, sputum culture-negative household contacts were analyzed using liquid chromatography high-resolution mass spectrometry (LC-MS) to identify metabolites. Statistical and bioinformatics methods were used to select accurate mass/charge (m/z) ions that were significantly different between the two groups at a false discovery rate (FDR) of q<0.05. Two-way hierarchical cluster analysis (HCA) was used to identify clusters of ions contributing to separation of cases and controls, and metabolomics databases were used to match these ions to known metabolites. Identity of specific D-series resolvins, glutamate and Mycobacterium tuberculosis (Mtb)-derived trehalose-6-mycolate was confirmed using LC-MS/MS analysis. Over 23,000 metabolites were detected in untargeted metabolomic analysis and 61 metabolites were significantly different between the two groups. HCA revealed 8 metabolite clusters containing metabolites largely upregulated in patients with TB disease, including anti-TB drugs, glutamate, choline derivatives, Mycobacterium tuberculosis-derived cell wall glycolipids (trehalose-6-mycolate and phosphatidylinositol) and pro-resolving lipid mediators of inflammation, known to stimulate resolution, efferocytosis and microbial killing. The resolvins were confirmed to be RvD1, aspirin-triggered RvD1, and RvD2. This study shows that high-resolution metabolomic analysis can differentiate patients with active TB disease from their asymptomatic household contacts. 
Specific metabolites upregulated in the plasma of patients with active TB disease, including Mtb-derived glycolipids and resolvins, have potential as biomarkers and may reveal pathways involved in TB disease pathogenesis and resolution.
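The abstract above selects ions at a false discovery rate of q<0.05 but does not name the procedure used. One common way to obtain q-values is the Benjamini-Hochberg step-up adjustment, sketched below in plain Python. Treating Benjamini-Hochberg as the method is an assumption made for this illustration, and the p-values in the example are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is the minimum of
    # p * m / rank over its own rank and all larger ranks (keeps q monotone).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Hypothetical p-values for four m/z features.
p = [0.01, 0.04, 0.03, 0.20]
q = benjamini_hochberg(p)
significant = [value < 0.05 for value in q]
```

In this toy case only the first feature passes the q<0.05 threshold, even though three of the raw p-values are below 0.05, which shows why the adjustment matters when thousands of ions are tested at once.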