Yihsuan S. Tsai scite author profile

Protein Identification Using Top-Down Spectra

In the last two years, because of advances in protein separation and mass spectrometry, top-down mass spectrometry has moved from analyzing single proteins to analyzing complex samples and identifying hundreds and even thousands of proteins. However, computational tools for database search of top-down spectra against protein databases are still in their infancy. We describe MS-Align+, a fast algorithm for top-down protein identification based on spectral alignment that enables searches for unexpected post-translational modifications. We also propose a method for evaluating statistical significance of top-down protein identifications and further benchmark various software tools on two top-down data sets from Saccharomyces cerevisiae and Salmonella typhimurium. We demonstrate that MS-Align+ significantly increases the number of identified spectra as compared with MASCOT and OMSSA on both data sets. Although MS-Align+ and ProSightPC have similar performance on the Salmonella typhimurium data set, MS-Align+ outperforms ProSightPC on the (more complex) Saccharomyces cerevisiae data set. Molecular & Cellular Proteomics 11: 10.1074/mcp.M111.008524, 1-13, 2012.

In the past two decades, proteomics was dominated by bottom-up mass spectrometry that analyzes digested peptides rather than intact proteins. Bottom-up approaches, although powerful, do have limitations in analyzing protein species, e.g. various proteolytic forms of the same protein or various protein isoforms resulting from alternative splicing. Top-down mass spectrometry focuses on analyzing intact proteins and large peptides (1-10) and has advantages in localizing multiple post-translational modifications (PTMs) in a coordinated fashion (e.g. combinatorial PTM code) and identifying multiple protein species (e.g. proteolytically processed protein species) (11). Until recently, most top-down studies were limited to single purified proteins (12-15). Top-down studies of protein mixtures were restricted by difficulties in separating and fragmenting intact proteins and a shortage of robust computational tools. In the last two years, because of advances in protein separation and top-down instrumentation, top-down mass spectrometry has moved from analyzing single proteins to analyzing complex samples containing hundreds and even thousands of proteins (16-21). Because algorithms for interpreting top-down spectra are still in their infancy, many recent developments include computational innovations in protein identification.

Deciphering diatom biochemical pathways via whole-cell proteomics

Nunn

Aker

Shaffer

et al. 2009

Aquat. Microb. Ecol.

Diatoms play a critical role in the oceans' carbon and silicon cycles; however, a mechanistic understanding of the biochemical processes that contribute to their ecological success remains elusive. Completion of the Thalassiosira pseudonana genome provided 'blueprints' for the potential biochemical machinery of diatoms, but offers only a limited insight into their biology under various environmental conditions. Using high-throughput shotgun proteomics, we identified a total of 1928 proteins expressed by T. pseudonana cultured under optimal growth conditions, enabling us to analyze this diatom's primary metabolic and biosynthetic pathways. Of the proteins identified, 70% are involved in cellular metabolism, while 11% are involved in the transport of molecules. We identified all of the enzymes involved in the urea cycle, thereby presenting a complete pathway to convert ammonia to urea, along with urea transporters, and the urea-degrading enzyme urease. Although metabolic exchange between these pathways remains ambiguous, their constitutive presence suggests complex intracellular nitrogen recycling. In addition, all C4-related enzymes for carbon fixation have been identified to be in abundance, with high protein sequence coverage. Quantification of mass spectra acquisitions demonstrated that the 20 most abundant proteins included an unexpectedly high expression of clathrin, which is the primary structural protein involved in endocytic transport. This result highlights a previously overlooked mechanism for the inter- and intracellular transport of nutrients and macromolecules in diatoms, potentially providing a missing link to organelle communication and metabolite exchange. Our results demonstrate the power of proteomics, and lay the groundwork for future comparative proteomic studies and directed analyses of specifically expressed proteins and biochemical pathways of oceanic diatoms.

Landscape of the SOX2 protein–protein interactome

Fang

Yoon

et al. 2011

Proteomics

SOX2 is a key gene implicated in maintaining the stemness of embryonic and adult stem cells that appears to re-activate in several human cancers including glioblastoma multiforme. Using immunoprecipitation (IP)/MS/MS, we identified 144 proteins that are putative SOX2 interacting proteins. Of note, SOX2 was found to interact with several heterogeneous nuclear ribonucleoprotein family proteins, including HNRNPA2B1, HNRNPA3, HNRNPC, HNRNPK, HNRNPL, HNRNPM, HNRNPR, HNRNPU, as well as other ribonucleoproteins, DNA repair proteins and helicases. Gene ontology (GO) analysis revealed that the SOX2 interactome was enriched for GO terms GO:0030529 ribonucleoprotein complex and GO:0004386 helicase activity. These findings indicate that SOX2 associates with the heterogeneous nuclear ribonucleoprotein complex, suggesting a possible role for SOX2 in post-transcriptional regulation in addition to its function as a transcription factor.

Precursor ion independent algorithm for top-down shotgun proteomics

Tsai

Scherl

Shaw

et al. 2009

J. Am. Soc. Mass Spectrom.

We present a precursor ion independent top-down algorithm (PIITA) for use in automated assignment of protein identifications from tandem mass spectra of whole proteins. To acquire the data, we utilize data-dependent acquisition to select protein precursor ions eluting from a C4-based HPLC column for collision induced dissociation in the linear ion trap of an LTQ-Orbitrap mass spectrometer. Gas-phase fractionation is used to increase the number of acquired tandem mass spectra, all of which are recorded in the Orbitrap mass analyzer. To identify proteins, the PIITA algorithm compares deconvoluted, deisotoped, observed tandem mass spectra to all possible theoretical tandem mass spectra for each protein in a genomic sequence database without regard for measured parent ion mass. Only after a protein is identified is any difference between measured and theoretical precursor mass used to identify and locate post-translational modifications. We demonstrate the application of PIITA to data generated via our wet-lab approach on a Salmonella typhimurium outer membrane extract and compare these results to bottom-up analysis. From these data, we identify 154 proteins by top-down analysis, 73 of which were not identified in a parallel bottom-up analysis. We also identify 201 unique isoforms of these 154 proteins at a false discovery rate (FDR) of <1%. (J Am Soc Mass
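The precursor-independent idea in this abstract can be illustrated with a small sketch. This is not the PIITA implementation or its scoring function: it is a toy shared-peak count, and the function names (`fragment_masses`, `count_matches`, `identify`, `precursor_delta`), the 10 ppm tolerance, and the restriction to neutral b-type and y-type fragment masses are all assumptions made for illustration. The point it shows is the ordering of steps: the database is ranked without using the precursor mass, and the precursor mass difference is examined only afterwards.

```python
from bisect import bisect_left

WATER = 18.010565  # monoisotopic mass of H2O in Da

# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def fragment_masses(sequence):
    """Sorted neutral masses of b-type (prefix) and y-type (suffix + water) fragments."""
    residues = [RESIDUE_MASS[aa] for aa in sequence]
    total = sum(residues)
    masses = []
    prefix = 0.0
    for mass in residues[:-1]:
        prefix += mass
        masses.append(prefix)                  # b-type fragment
        masses.append(total - prefix + WATER)  # complementary y-type fragment
    return sorted(masses)


def count_matches(observed, theoretical, ppm=10.0):
    """Count observed masses that fall within a ppm tolerance of a theoretical mass."""
    hits = 0
    for mass in observed:
        i = bisect_left(theoretical, mass)
        neighbours = theoretical[max(i - 1, 0):i + 1]  # nearest values on either side
        if any(abs(mass - t) <= mass * ppm * 1e-6 for t in neighbours):
            hits += 1
    return hits


def identify(observed, database, ppm=10.0):
    """Rank proteins by shared fragment count; the precursor mass is never consulted."""
    scores = {name: count_matches(observed, fragment_masses(seq), ppm)
              for name, seq in database.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


def precursor_delta(measured_mass, sequence):
    """Measured minus theoretical intact mass, inspected only after identification."""
    theoretical = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    return measured_mass - theoretical
```

In use, `identify` would be called on the deconvoluted fragment masses of one spectrum, and `precursor_delta` on the winning sequence; a delta close to a known modification mass would then hint at a modified form. Locating the modification along the sequence is outside the scope of this sketch.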

Increasing information from shotgun proteomic data by accounting for misassigned precursor ion masses

et al. 2008

Although mass spectrometers are capable of providing high mass accuracy data, assignment of true monoisotopic precursor ion mass is complicated during data-dependent ion selection for LC-MS/MS analysis of complex mixtures. The complication arises when chromatographic peak widths for a given analyte exceed the time required to acquire a precursor ion mass spectrum. The result is that many measured monoisotopic masses are misassigned because they are calculated from a single mass spectrum with poor ion statistics, based on only a fraction of the total available ions for a given analyte. Such data in turn produce errors in automated database searches, where precursor m/z value is one search parameter. We propose here a postacquisition approach to correct misassigned monoisotopic m/z values that involves peak detection over the entire elution profile and correction of the precursor ion monoisotopic mass. As a result of using this approach to reprocess shotgun proteomic data, we increased peptide sequence assignments by 10% while reducing the estimated false positive ratio from 1% to 0.2%. We also show that 4% of the salvaged identifications may be accounted for by correction of mixed tandem mass spectra resulting from fragmentation of multiple peptides simultaneously, a situation which we refer to as accidental CID.
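One way to picture why using the whole elution profile helps is an intensity-weighted average of the monoisotopic peak position across scans. The sketch below is an assumption made for illustration, not the procedure from the paper: the function names (`corrected_mz`, `neutral_mass`, `ppm_error`) are hypothetical, the peak picking that yields one (m/z, intensity) pair per scan is taken as given, and the numbers in the usage note are invented.

```python
PROTON = 1.007276  # mass of a proton in Da


def corrected_mz(profile):
    """Intensity-weighted mean m/z of the monoisotopic peak over an elution profile.

    profile is a list of (mz, intensity) pairs, one per precursor scan.
    """
    total = sum(intensity for _, intensity in profile)
    if total <= 0:
        raise ValueError("profile has no signal")
    return sum(mz * intensity for mz, intensity in profile) / total


def neutral_mass(mz, charge):
    """Neutral monoisotopic mass from an m/z value and a positive charge state."""
    return (mz - PROTON) * charge


def ppm_error(measured, reference):
    """Signed relative error of a measured value in parts per million."""
    return (measured - reference) / reference * 1e6
```

For example, a hypothetical profile such as `[(500.2510, 1e3), (500.2502, 5e4), (500.2500, 1e5), (500.2498, 5e4), (500.2490, 1e3)]` is dominated by the intense scans near the chromatographic apex, so the weak scans at the edges of the peak, which carry the largest m/z deviations, contribute little to the corrected value.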
