Chiral organosilanes are of great value in asymmetric synthesis, functional materials, and medicinal chemistry. Compared with single-silyl compounds, bis(silyl) ones are understudied because of the lack of efficient synthetic protocols. The development of efficient synthetic approaches to access bis(silyl) compounds is highly desirable for studying their basic properties and potential utilities. Here, a cobalt-catalyzed sequential double hydrosilylation of aliphatic alkynes was developed to synthesize highly enantioenriched gem-bis(silyl)alkanes. This protocol uses simple aliphatic alkynes and silanes to construct valuable chiral gem-bis(silyl)alkanes. Control experiments, isotopic labeling experiments, kinetic studies, and density functional theory calculations were conducted to elucidate the reaction mechanism. The synthetic versatility of gem-bis(silyl)alkanes was demonstrated by the synthesis of chiral organosilanols and α-hydroxysilanes through selective C-Si bond transformation, and by hydrosilylation of alkynes to construct chiral silanes containing adjacent C- and Si-stereocenters.
Existing data acquisition modes such as full-scan, data-dependent acquisition (DDA), and data-independent acquisition (DIA) often present limited capabilities in capturing metabolic information in liquid chromatography-mass spectrometry (LC-MS)-based metabolomics. In this work, we proposed a novel metabolomic data acquisition workflow that combines DDA and DIA analyses to achieve better metabolomic data quality, including enhanced metabolome coverage, tandem mass spectrometry (MS2) coverage, and MS2 quality. This workflow, named data-dependent-assisted data-independent acquisition (DaDIA), performs untargeted metabolomic analysis of individual biological samples using DIA mode and of the pooled quality control (QC) samples using DDA mode. This combination takes advantage of the high feature number and MS2 spectral coverage of the DIA data and the high MS2 spectral quality of the DDA data. To analyze the heterogeneous DDA and DIA data, we further developed a computational program, DaDIA.R, to automatically extract metabolic features and perform streamlined metabolite annotation of DaDIA data sets. Using human urine samples, we demonstrated that the DaDIA workflow delivers remarkably improved data quality when compared to conventional DDA or DIA metabolomics. In particular, both the number of detected features and the number of annotated metabolites were greatly increased. Further biological demonstration using a leukemia metabolomics study also showed that the DaDIA workflow can efficiently detect and annotate around 4 times more significant metabolites than the DDA workflow, with broad MS2 coverage and high MS2 spectral quality for downstream statistical analysis and biological interpretation. Overall, this work represents a critical development of data acquisition mode in untargeted metabolomics, which can greatly benefit untargeted metabolomics for a wide range of biological applications.
Spectral similarity comparison through tandem mass spectrometry (MS2) is a powerful approach to annotate known and unknown metabolic features in mass spectrometry (MS)-based untargeted metabolomics. In this work, we proposed the concept of hypothetical neutral loss (HNL), which is the mass difference between a pair of fragment ions in an MS2 spectrum. We demonstrated that HNL values contain core structural information that can be used to accurately assess the structural similarity between two MS2 spectra. We then developed the Core Structure-based Search (CSS) algorithm based on HNL values. CSS was validated with sets of hundreds of randomly selected metabolites and their reference MS2 spectra, showing significantly improved correlation between spectral and structural similarities. Compared to state-of-the-art spectral similarity algorithms, CSS generates better ranking of structurally relevant chemicals among false positives. Combining CSS, an HNL library, and a biotransformation database, we further developed Metabolite core structure-based Search (McSearch), a novel computational solution to facilitate the annotation of unknown metabolites using the reference MS2 spectra of their structural analogs. McSearch generates better results on the Critical Assessment of Small Molecule Identification (CASMI) 2017 data set than conventional unknown feature annotation programs. McSearch was also tested on experimental MS2 data of xenobiotic metabolite derivatives belonging to three different metabolic pathways. Our results confirmed that McSearch can better capture the underlying structural similarity between MS2 spectra. Overall, this work provides a novel direction for metabolite annotation via HNL values, paving the way for annotating metabolites using their structurally similar compounds.
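The HNL definition above (the mass difference between a pair of fragment ions in one MS2 spectrum) can be illustrated with a minimal Python sketch. The function names, the 0.01 Da tolerance, and the simple shared-value count are assumptions made for illustration; this is not the CSS scoring function, whose details the abstract does not give.

```python
from itertools import combinations


def hypothetical_neutral_losses(fragment_mzs):
    """Return the sorted mass differences between every pair of fragment ions."""
    mzs = sorted(fragment_mzs)
    # combinations() on a sorted list yields (lower, higher) pairs
    return sorted(round(high - low, 4) for low, high in combinations(mzs, 2))


def shared_hnl_count(spectrum_a, spectrum_b, tol=0.01):
    """Count HNL values of spectrum_a matched by any HNL of spectrum_b.

    `tol` (in Da) is an illustrative assumption, not a published setting.
    """
    hnl_a = hypothetical_neutral_losses(spectrum_a)
    hnl_b = hypothetical_neutral_losses(spectrum_b)
    return sum(any(abs(a - b) <= tol for b in hnl_b) for a in hnl_a)


# Two toy spectra with no fragment m/z in common but identical spacing
spec_a = [100.0, 118.0, 146.0]
spec_b = [200.0, 218.0, 246.0]
print(hypothetical_neutral_losses(spec_a))  # [18.0, 28.0, 46.0]
print(shared_hnl_count(spec_a, spec_b))     # 3
```

The toy spectra share no fragment m/z values, so a direct fragment-matching comparison would find nothing in common, yet all three HNL values coincide. That is the kind of structural relationship between analogs that HNL values are intended to expose.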
In-source fragmentation (ISF) is a naturally occurring phenomenon during electrospray ionization (ESI) in liquid chromatography−mass spectrometry (LC-MS) analysis. ISF leads to false metabolite annotation in untargeted metabolomics, prompting misinterpretation of the underlying biological mechanisms. Conventional metabolomic data cleaning mainly focuses on the annotation of adducts and isotopes, and the recognition of ISF features is mainly based on common neutral losses and the LC coelution pattern. In this work, we recognized three increasingly important patterns of ISF features, including (1) coeluting with their precursor ions, (2) being in the tandem MS (MS2) spectra of their precursor ions, and (3) sharing similar MS2 fragmentation patterns with their precursor ions. Based on these patterns, we developed an R package, ISFrag, to comprehensively recognize all possible ISF features from LC-MS data generated from full-scan, data-dependent acquisition, and data-independent acquisition modes without the assistance of common neutral loss information or an MS2 spectral library. Tested using metabolite standards, we achieved 100% correct recognition of level 1 ISF features and over 80% correct recognition of level 2 ISF features. Further application of ISFrag on untargeted metabolomics data allows us to identify ISF features that can potentially cause false metabolite annotation at an omics scale. With the help of ISFrag, we performed a systematic investigation of how ISF features are influenced by different MS parameters, including capillary voltage, end plate offset, ion energy, and "collision energy". Our results show that while increasing energies can increase the number of real metabolic features and ISF features, the percentage of ISF features might not necessarily increase. Finally, using ISFrag, we created an ISF pathway to visualize the relationships between multiple ISF features that belong to the same precursor ion.
ISFrag is freely available on GitHub (https://github.com/HuanLab/ISFrag).
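The first two ISF patterns listed above (coelution with the precursor, and presence in the precursor's MS2 spectrum) can be sketched as a simple filter. This is a hypothetical Python illustration and not the ISFrag R package's API; the dictionary layout, function name, and tolerance values are all assumptions.

```python
def is_candidate_isf(feature, precursor, rt_tol=0.05, mz_tol=0.01):
    """Flag `feature` as a possible in-source fragment of `precursor`.

    Both arguments are dicts with 'mz' and 'rt' (minutes); `precursor`
    also carries 'ms2', a list of its fragment m/z values.
    The tolerances are illustrative assumptions.
    """
    # A fragment must be lighter than the ion it came from
    lighter = feature["mz"] < precursor["mz"]
    # Pattern 1: the feature coelutes with the precursor
    coelutes = abs(feature["rt"] - precursor["rt"]) <= rt_tol
    # Pattern 2: the feature's m/z appears in the precursor's MS2 spectrum
    in_ms2 = any(abs(feature["mz"] - frag) <= mz_tol for frag in precursor["ms2"])
    return lighter and coelutes and in_ms2


precursor = {"mz": 180.0634, "rt": 5.20, "ms2": [162.0528, 144.0423, 85.0284]}
coeluting = {"mz": 162.0530, "rt": 5.21}
later_peak = {"mz": 162.0530, "rt": 6.00}

print(is_candidate_isf(coeluting, precursor))   # True
print(is_candidate_isf(later_peak, precursor))  # False
```

The third pattern, similarity between the MS2 spectra of the feature and its precursor, would need a spectral similarity measure and is left out of this sketch.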
The nonlinear signal response of electrospray ionization (ESI) presents a critical limitation for mass spectrometry (MS)-based quantitative analysis. In the field of metabolomics research, this issue has largely remained unaddressed; MS signal intensities are usually directly used to calculate fold changes for quantitative comparison. In this work, we demonstrate that, due to the nonlinear ESI response, signal intensity ratios of a metabolic feature calculated between two samples may not reflect their real metabolic concentration ratios (i.e., fold-change compression), implying that conventional fold-change calculations directly using MS signal intensities can be misleading. In this regard, we developed a quality control (QC) sample-based signal calibration workflow to overcome the quantitative bias caused by the nonlinear ESI response. In this workflow, calibration curves for every metabolic feature are first established using a QC sample injected at a series of injection volumes. The MS signals of each metabolic feature are then calibrated to their equivalent QC injection volumes for comparative analysis. We demonstrated this novel workflow in a targeted metabolite analysis, showing that the accuracy of fold-change calculations can be significantly improved. Furthermore, in a metabolomic comparison of the bone marrow interstitial fluid samples from leukemia patients before and after chemotherapy, an additional 59 significant metabolic features were found with fold changes larger than 1.5, and an additional 97 significant metabolic features had fold changes corrected by more than 0.1. This work enables high-quality quantitative analysis in untargeted metabolomics, thus supporting more confident generation of biological hypotheses.
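The calibration idea described above can be sketched in a few lines of Python: a feature's intensities from a QC injection-volume series define a curve, and each sample intensity is mapped back to its equivalent QC volume before fold changes are taken. The piecewise-linear interpolation, the function name, and the toy numbers are assumptions for illustration; the abstract does not specify the curve model used in the published workflow.

```python
from bisect import bisect_left


def equivalent_qc_volume(intensity, qc_volumes, qc_intensities):
    """Map an MS intensity to the QC injection volume giving the same signal.

    Assumes `qc_intensities` increases strictly with `qc_volumes` and uses
    piecewise-linear interpolation (an illustrative choice of curve model).
    """
    if not qc_intensities[0] <= intensity <= qc_intensities[-1]:
        raise ValueError("intensity lies outside the calibrated range")
    i = bisect_left(qc_intensities, intensity)
    if qc_intensities[i] == intensity:
        return float(qc_volumes[i])
    x0, x1 = qc_intensities[i - 1], qc_intensities[i]
    y0, y1 = qc_volumes[i - 1], qc_volumes[i]
    return y0 + (y1 - y0) * (intensity - x0) / (x1 - x0)


# Toy QC series: doubling the volume does not double the signal
qc_volumes = [1, 2, 4, 8]              # injection volumes (µL)
qc_intensities = [100, 180, 300, 450]  # observed intensities for one feature

sample_a, sample_b = 180, 450
raw_fold_change = sample_b / sample_a
calibrated_fold_change = (
    equivalent_qc_volume(sample_b, qc_volumes, qc_intensities)
    / equivalent_qc_volume(sample_a, qc_volumes, qc_intensities)
)
print(raw_fold_change)         # 2.5
print(calibrated_fold_change)  # 4.0
```

In this toy case the raw intensity ratio of 2.5 understates the calibrated ratio of 4.0, which is the fold-change compression the abstract describes.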
Tandem mass spectral (MS/MS) data in liquid chromatography–tandem mass spectrometry (LC-MS/MS) analysis are often contaminated because the selection of precursor ions is based on a low-resolution quadrupole mass filter. In this work, we developed a strategy to differentiate contamination fragment ions (CFIs) from true fragment ions (TFIs) in an MS/MS spectrum. The rationale is that TFIs should coelute with their parent ions, but CFIs should not. To assess coelution, we performed a parallel LC-MS/MS analysis in data-independent acquisition (DIA) with all-ion-fragmentation (AIF) mode. Using the DIA (AIF) data, a peak–peak correlation (PPC) score is calculated between the extracted ion chromatogram (EIC) of the fragment ion using the MS/MS scans and the EIC of the precursor ion using the MS1 scans. A high PPC score is an indication of TFIs, and a low PPC score is an indication of CFIs. Tested using metabolomics data generated by high-resolution QTOF and Orbitrap MS from various vendors in different LC-MS configurations, we found that more than 70% of the fragment ions have PPC scores < 0.8 and identified three common sources of CFIs, including (1) solvent contamination, (2) adjacent chemical contamination, and (3) undetermined signals from artifacts and noise. Combining PPC scores with other precursor and fragment ion information, we further developed a machine learning model that can robustly and conservatively predict CFIs. Incorporating the machine learning model, we created an R program, MS2Purifier, to automatically recognize CFIs and clean MS/MS spectra of metabolic features in LC-MS/MS data with high sensitivity and specificity.
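The coelution test described above can be sketched by scoring how well a fragment ion's EIC tracks its precursor's EIC. The Python sketch below assumes the PPC score is a Pearson correlation over intensities sampled at the same retention times, and uses 0.8 only as an illustrative cutoff taken from the figure quoted in the abstract. The function name and toy chromatograms are hypothetical, and MS2Purifier itself combines PPC with other features in a machine learning model.

```python
import math


def ppc_score(eic_a, eic_b):
    """Pearson correlation between two EICs sampled at the same scans."""
    n = len(eic_a)
    if n != len(eic_b) or n < 2:
        raise ValueError("EICs must have the same length of at least 2")
    mean_a = sum(eic_a) / n
    mean_b = sum(eic_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(eic_a, eic_b))
    var_a = sum((a - mean_a) ** 2 for a in eic_a)
    var_b = sum((b - mean_b) ** 2 for b in eic_b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a flat trace carries no coelution evidence
    return cov / math.sqrt(var_a * var_b)


precursor_eic = [0, 10, 50, 100, 50, 10, 0]        # MS1 trace of the precursor
true_fragment = [0.3 * x for x in precursor_eic]   # same peak shape, lower signal
background_ion = [40, 42, 38, 41, 39, 40, 41]      # roughly constant background

for name, eic in [("true fragment", true_fragment), ("background", background_ion)]:
    score = ppc_score(precursor_eic, eic)
    label = "TFI-like" if score >= 0.8 else "CFI-like"
    print(f"{name}: PPC = {score:.2f} ({label})")
```

A fragment that rises and falls with its precursor scores close to 1, while a background ion with a roughly constant signal scores low and would be flagged for closer inspection.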
Extracting metabolic features from liquid chromatography−mass spectrometry (LC-MS) data relies on the recognition of extracted ion chromatogram (EIC) peak shapes using peak picking algorithms. Unfortunately, all peak picking algorithms share a significant drawback: they generate a problematic number of false positives. In this work, we take advantage of deep learning technology to develop a convolutional neural network (CNN)-based program that can automatically recognize metabolic features with poor EIC shapes, which are of low feature fidelity and more likely to be false. Our CNN model was trained using 25,095 EIC plots collected from 22 LC-MS-based metabolomics projects covering various sample types and LC and MS conditions. Notably, we manually inspected all the EIC plots to assign good or poor EIC quality for accurate model training. The trained CNN model is embedded into a C#-based program, named EVA (short for evaluation). The EVA Windows Application is a versatile platform that can process metabolic features generated by LC-MS systems of various vendors and processed using various data processing software. Our comprehensive evaluation of EVA indicates that it achieves over 90% classification accuracy. EVA can be readily used in LC-MS-based metabolomics projects and is freely available on the Microsoft Store by searching "EVA Metabolomics".