Motivation: Post-processing tools are widely used by the community to maximize the information gained from a proteomics search engine. The most notable example is Percolator, a semi-supervised machine learning model that learns a new scoring function for a given dataset. The usage of such tools is, however, bound to the search engine's scoring scheme, which does not always make full use of the intensity information present in a spectrum. We aim to show how Percolator can be applied in a way that maximizes the use of spectrum intensity information by leveraging another machine learning-based tool, MS2PIP, which predicts fragment ion peak intensities.
Results: We show how comparing predicted intensities to annotated experimental spectra by calculating direct similarity metrics provides enough information for a tool such as Percolator to accurately separate two classes of peptide-to-spectrum matches. This approach extracts more information from the data than simpler intensity-based metrics, such as peak counting or summing explained intensities, while maintaining control of statistics such as the false discovery rate.
Availability and implementation: All of the code is available online at https://github.com/compomics/ms2rescore.
Supplementary information: Supplementary data are available at Bioinformatics online.
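The idea of turning a predicted-versus-observed intensity comparison into features for a rescoring tool can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names are hypothetical, the chosen metrics (Pearson correlation, cosine similarity, absolute differences) are common similarity measures rather than the confirmed feature set of the published tool, and the two intensity lists are assumed to be already aligned so that position i refers to the same fragment ion in both. No MS2PIP or Percolator API is called.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def cosine(x, y):
    """Cosine similarity of two equal-length lists; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return dot / (norm_x * norm_y)


def similarity_features(predicted, observed):
    """Hypothetical feature vector for one peptide-to-spectrum match.

    Assumes `predicted` and `observed` hold intensities for the same
    fragment ions in the same order.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("intensity lists must be non-empty and equally long")
    abs_diff = [abs(p - o) for p, o in zip(predicted, observed)]
    return {
        "pearson": pearson(predicted, observed),
        "cosine": cosine(predicted, observed),
        "mean_abs_diff": sum(abs_diff) / len(abs_diff),
        "max_abs_diff": max(abs_diff),
    }
```

Each match would yield one such feature dictionary, and the collection of them could then be passed to a semi-supervised rescoring tool in whatever input format that tool expects.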
When analyzing mass spectrometry imaging data sets, assigning a molecule to each of the thousands of generated images is a very complex task. Recent efforts have taken lessons from (tandem) mass spectrometry proteomics and applied them to imaging mass spectrometry metabolomics, with good results. Our goal is to go a step further in this direction and apply a well-established, data-driven method to improve the results obtained from an annotation engine. By using a data-driven rescoring strategy, we are able to consistently improve the sensitivity of the annotation engine while maintaining control of statistics such as the estimated false discovery rate. All the code necessary to run a search and extract the additional features can be found at https://github.com/anasilviacs/sm-engine, and the code to rescore the results from a search at https://github.com/anasilviacs/rescore-metabolites.
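To make "maintaining control of the estimated false discovery rate" concrete, the following is a minimal sketch of one common way such an estimate is computed: a target-decoy approach, in which each scored annotation is labeled as either a target or a decoy. This is an assumption for illustration; the abstract does not state how the estimate is obtained, and the function names below are hypothetical rather than taken from the linked repositories.

```python
def q_values(scores, is_decoy):
    """Estimate a q-value for each annotation from target-decoy labels.

    Higher scores are assumed to be better. The FDR at a score threshold
    is estimated as (#decoys accepted) / (#targets accepted).
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    fdrs = []
    targets = decoys = 0
    for i in order:
        if is_decoy[i]:
            decoys += 1
        else:
            targets += 1
        fdrs.append(decoys / max(targets, 1))

    # The q-value is the lowest FDR at which this annotation is still accepted,
    # so take a running minimum from the worst-scoring end upwards.
    qvals = [0.0] * len(scores)
    running_min = float("inf")
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdrs[rank])
        qvals[order[rank]] = running_min
    return qvals


def count_accepted_targets(scores, is_decoy, threshold):
    """Number of target annotations whose q-value is at or below `threshold`."""
    qvals = q_values(scores, is_decoy)
    return sum(1 for q, d in zip(qvals, is_decoy) if not d and q <= threshold)
```

Under this sketch, an improvement in sensitivity from rescoring would show up as a larger count of accepted targets at the same q-value threshold, computed once with the original scores and once with the rescored ones.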
Mass spectrometers typically output data in proprietary binary formats. While converter suites and standardized XML formats have been developed in response, these conversion steps come with non-negligible overhead in computational time and storage space. As a result, simple, everyday data inspection tasks are often beyond the skills of the mass spectrometrist, who is unable to freely access the acquired data. We therefore describe the unthermo library for convenient, platform-independent access to Thermo Scientific RAW files, together with the associated online playground that transforms small and easily understandable scriptlets into executable programs for end-users. By fostering the provision of code examples and snippet exchange, the playground lets the interested mass spectrometrist or researcher quickly assemble custom scripts for their particular purpose. In this way, the data in these RAW files can be mined much more readily and directly by the user, and fast, automated raw data extraction or analysis can finally become part of the daily routine of the mass spectrometrist.