Measuring precise concentrations of proteins can provide insights into biological processes. Here, we use efficient protein extraction and sample fractionation and state-of-the-art quantitative mass spectrometry techniques to generate a comprehensive, condition-dependent protein abundance map of Escherichia coli. We measure cellular protein concentrations for 55% of predicted E. coli genes (>2300 proteins) under 22 different experimental conditions and identify methylation and N-terminal protein acetylations previously not known to be prevalent in bacteria. We uncover system-wide proteome allocation, expression regulation, and post-translational adaptations. These data provide a valuable resource for the systems biology and broader E. coli research communities.
The complete and specific proteolytic cleavage of protein samples into peptides is crucial for the success of every shotgun LC-MS/MS experiment. In particular, popular peptide-based label-free and targeted mass spectrometry approaches rely on efficient generation of fully cleaved peptides to ensure accurate and sensitive protein quantification. In contrast to previous studies, we globally and quantitatively assessed the efficiency of different digestion strategies using a yeast cell lysate, label-free quantification, and statistical analysis. Digestion conditions include double tryptic, surfactant-assisted, and tandem-combinatorial Lys-C/trypsin digestion. In comparison to tryptic digests, Lys-C/trypsin digests were found to be the most efficient at yielding fully cleaved peptides while reducing the abundance of miscleaved peptides. Subsequent sequence context analysis revealed improved digestion performance of Lys-C/trypsin for miscleaved sequence stretches flanked by charged basic and particularly acidic residues. Furthermore, targeted MS analysis demonstrated a more comprehensive protein cleavage only after Lys-C/trypsin digestion, resulting in a more accurate absolute protein quantification and extending the number of peptides suitable for SRM assay development. Therefore, we conclude that a serial Lys-C/trypsin digestion is highly attractive for most applications in quantitative MS-based proteomics building on in-solution digestion schemes.
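To make the notion of a "miscleaved" peptide concrete, the following minimal Python sketch counts missed tryptic cleavage sites in a peptide sequence. It assumes the common simplified rule that trypsin cleaves C-terminal to K or R unless the next residue is P; the function name `count_missed_cleavages` and the example sequences are illustrative and are not taken from the study.

```python
import re


def count_missed_cleavages(peptide: str) -> int:
    """Count internal tryptic sites that were left uncleaved.

    Assumed rule (a simplification): trypsin cleaves after K or R
    unless the following residue is P. A K/R at the C-terminus is the
    expected end of a tryptic peptide and is not counted.
    """
    # The lookahead requires a following residue that is not P, so the
    # C-terminal residue never matches and K/R-P bonds are skipped.
    return len(re.findall(r"[KR](?=[^P])", peptide.upper()))


# Hypothetical peptides: fully cleaved, two missed sites, proline-blocked.
fully_cleaved = count_missed_cleavages("LGEHNIDVLEGNEQFINAAK")
two_missed = count_missed_cleavages("AAKDEEKR")
proline_blocked = count_missed_cleavages("AKPGR")
```

A peptide with a count of zero corresponds to a fully cleaved peptide in the sense used above. Real search engines apply more detailed cleavage rules, so this is only a rough approximation.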
Molecular diversity of surface receptors has been hypothesized to provide a mechanism for selective synaptic connectivity. Neurexins are highly diversified receptors that drive the morphological and functional differentiation of synapses. Using a single cDNA sequencing approach, we detected 1,364 unique neurexin-α and 37 neurexin-β mRNAs produced by alternative splicing of neurexin pre-mRNAs. This molecular diversity results from near-exhaustive combinatorial use of alternative splice insertions in Nrxn1α and Nrxn2α. By contrast, Nrxn3α exhibits several highly stereotyped exon selections that incorporate novel elements for posttranscriptional regulation of a subset of transcripts. Complexity of Nrxn1α repertoires correlates with the cellular complexity of neuronal tissues, and a specific subset of isoforms is enriched in a purified cell type. Our analysis defines the molecular diversity of a critical synaptic receptor and provides evidence that neurexin diversity is linked to cellular diversity in the nervous system.
There is great interest in reliable ways to obtain absolute protein abundances at a proteome-wide scale. To this end, label-free LC-MS/MS quantification methods have been proposed where all identified proteins are assigned an estimated abundance. Several variants of this quantification approach have been presented, based on either the number of spectral counts per protein or MS1 peak intensities. Equipped with several datasets representing real biological environments, containing a high number of accurately quantified reference proteins, we evaluate five popular low-cost and easily implemented quantification methods (Absolute Protein Expression, Exponentially Modified Protein Abundance Index, Intensity-Based Absolute Quantification Index, Top3, and MeanInt). Our results demonstrate considerably improved abundance estimates upon implementing accurately quantified reference proteins, that is, using spiked-in stable-isotope-labeled standard peptides or a standard protein mix to generate a properly calibrated quantification model. We show that only the Top3 method is directly proportional to protein abundance over the full quantification range and is the preferred method in the absence of reference protein measurements. Additionally, we demonstrate that spectral-count-based quantification methods are associated with higher errors than MS1 peak-intensity-based methods. Furthermore, we investigate the impact of miscleaved, modified, and shared peptides as well as protein size and the number of employed reference proteins on quantification accuracy.
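As an illustration of the Top3 idea named above, here is a minimal Python sketch. It assumes the usual definition of Top3 as the mean MS1 intensity of a protein's three most intense peptides. The input layout, the function name `top3_abundance`, and the choice to average over fewer peptides when a protein has fewer than three are assumptions made for this example, not details from the study.

```python
from collections import defaultdict


def top3_abundance(peptide_intensities):
    """Estimate a per-protein abundance proxy with a Top3-style score.

    peptide_intensities: iterable of (protein_id, peptide, intensity).
    Returns {protein_id: mean of the three highest peptide intensities}.
    Assumption: proteins with fewer than three peptides are averaged
    over the peptides that are available.
    """
    by_protein = defaultdict(list)
    for protein_id, _peptide, intensity in peptide_intensities:
        by_protein[protein_id].append(float(intensity))

    scores = {}
    for protein_id, values in by_protein.items():
        top = sorted(values, reverse=True)[:3]
        scores[protein_id] = sum(top) / len(top)
    return scores


# Hypothetical peptide-level MS1 intensities for two proteins.
example = [
    ("P1", "PEPTIDEA", 10.0),
    ("P1", "PEPTIDEB", 30.0),
    ("P1", "PEPTIDEC", 20.0),
    ("P1", "PEPTIDED", 5.0),
    ("P2", "PEPTIDEE", 8.0),
]
scores = top3_abundance(example)
```

In the calibrated setting the abstract describes, such scores would then be related to known reference protein amounts, for instance with a linear fit in log space. That step is omitted here because the abstract does not specify the model.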
The multiplexing capabilities of isobaric mass tag-based protein quantification, such as Tandem Mass Tags or Isobaric Tag for Relative and Absolute Quantitation, have dramatically increased the scope of mass spectrometry-based proteomics studies. Not only does the technology allow for the simultaneous quantification of multiple samples in a single MS injection, but its seamless compatibility with extensive sample prefractionation methods allows for comprehensive studies of complex proteomes. However, reporter ion-based quantification has often been criticized for limited quantification accuracy due to interference from coeluting peptides and peptide fragments. In this study, we investigate the extent of this problem and propose an effective and easy-to-implement remedy that relies on spiking a 6-protein calibration mixture into the samples. We evaluated our ratio adjustment approach using two large-scale TMT 10-plex data sets derived from a human cancer and noncancer cell line as well as E. coli cells grown under two different conditions. Furthermore, we analyzed a complex 2-proteome artificial sample mixture and investigated the precision of TMT and precursor ion intensity-based label-free quantification. Studying the protein set identified by both methods, we found that differentially abundant proteins were assigned dramatically higher statistical significance when quantified using TMT. Data are available via ProteomeXchange with identifier PXD003346.