Details of the algorithms are described in supporting information available from the Halmstad University website www.hh.se/staff/bioinf/
In order to maximize protein identification by peptide mass fingerprinting noise peaks must be removed from spectra and recalibration is often required. The preprocessing of the spectra before database searching is essential but is time-consuming. Nevertheless, the optimal database search parameters often vary over a batch of samples. For high-throughput protein identification, these factors should be set automatically, with no or little human intervention. In the present work automated batch filtering and recalibration using a statistical filter is described. The filter is combined with multiple data searches that are performed automatically. We show that, using several hundred protein digests, protein identification rates could be more than doubled, compared to standard database searching. Furthermore, automated large-scale in-gel digestion of proteins with endoproteinase LysC, and matrix-assisted laser desorption/ionization-time of flight (MALDI-TOF) analysis, followed by subsequent trypsin digestion and MALDI-TOF analysis were performed. Several proteins could be identified only after digestion with one of the enzymes, and some less significant protein identifications were confirmed after digestion with the other enzyme. The results indicate that identification of especially small and low-abundance proteins could be significantly improved after sequential digestions with two enzymes.
An automated peak picking strategy is presented where several peak sets with different signal-tonoise levels are combined to form a more reliable statement on the protein identity. The strategy is compared against both manual peak picking and industry standard automated peak picking on a set of mass spectra obtained after tryptic in gel digestion of 2D-gel samples from human fetal fibroblasts. The set of spectra contain samples ranging from strong to weak spectra, and the proposed multiplescale method is shown to be much better on weak spectra than the industry standard method and a human operator, and equal in performance to these on strong and medium strong spectra. It is also demonstrated that peak sets selected by a human operator display a considerable variability and that it is impossible to speak of a single "true" peak set for a given spectrum. The described multiplescale strategy both avoids time-consuming parameter tuning and exceeds the human operator in protein identification efficiency. The strategy therefore promises reliable automated userindependent protein identification using peptide mass fingerprints.* Corresponding author, email: denni@ide.hh.se, phone +46-35 16 74 77, fax: +46-35 12 03 48
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.