Pollen, the male gametophyte of flowering plants, represents an ideal biological system to study developmental processes, such as cell polarity, tip growth, and morphogenesis. Upon hydration, the metabolically quiescent pollen rapidly switches to an active state, exhibiting extremely fast growth. This rapid switch requires relevant proteins to be stored in the mature pollen, where they have to retain functionality in a desiccated environment. Using a shotgun proteomics approach, we unambiguously identified ∼3500 proteins in Arabidopsis pollen, including 537 proteins that were not identified in genetic or transcriptomic studies. To generate this comprehensive reference data set, which extends the previously reported pollen proteome by a factor of 13, we developed a novel deterministic peptide classification scheme for protein inference. This generally applicable approach considers the gene model–protein sequence–protein accession relationships. It allowed us to classify and eliminate ambiguities inherently associated with any shotgun proteomics data set, to report a conservative list of protein identifications, and to seamlessly integrate data from previous transcriptomics studies. Manual validation of proteins unambiguously identified by a single, information-rich peptide enabled us to significantly reduce the false discovery rate while keeping valuable identifications of shorter and lower-abundance proteins. Bioinformatic analyses revealed a higher stability of pollen proteins compared to those of other tissues and implicated a protein family of previously unknown function in vesicle trafficking. Interestingly, the pollen proteome is most similar to that of seeds, indicating physiological similarities between these developmentally distinct tissues.
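The idea of classifying peptides by their gene model, protein sequence, and protein accession relationships can be illustrated with a minimal Python sketch. It is not the authors' scheme: the toy database `PROTEINS`, the function `classify_peptide`, and the three label strings are hypothetical, and peptides are matched by plain substring search without modelling enzymatic cleavage.

```python
# Hypothetical toy database: accession -> (gene model, protein sequence).
PROTEINS = {
    "P1.1": ("G1", "MKTAYIAKQRLSDEGK"),
    "P1.2": ("G1", "MKTAYIAKQRWWDEGK"),  # second isoform of gene model G1
    "P2.1": ("G2", "MSSLSDEGKPLVNR"),
}


def classify_peptide(peptide, proteins=PROTEINS):
    """Return a coarse, illustrative evidence label for one peptide."""
    # Collect every database entry whose sequence contains the peptide.
    hits = {acc: entry for acc, entry in proteins.items() if peptide in entry[1]}
    if not hits:
        return "no match"
    gene_models = {gene_model for gene_model, _ in hits.values()}
    sequences = {sequence for _, sequence in hits.values()}
    if len(gene_models) > 1:
        # Ambiguous at the gene level: the least informative case.
        return "shared across gene models"
    if len(sequences) == 1:
        # One gene model and one distinct protein sequence.
        return "unique protein sequence"
    # One gene model, but several of its isoforms contain the peptide.
    return "isoform group of one gene model"


for pep in ("QRLSDEGK", "MKTAYIAK", "LSDEGK"):
    print(pep, "->", classify_peptide(pep))
```

Because the rule is deterministic, the same peptide and database always yield the same label, so a conservative protein list could be built by keeping only the peptides in the most informative class.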
A new study reveals a functional rule for N-terminal acetylation in higher eukaryotes called the (X)PX rule and describes a generic method that prevents this modification to allow the study of N-terminal acetylation in any given protein.
Bradyrhizobium japonicum, a Gram-negative soil bacterium that establishes an N₂-fixing symbiosis with its legume host soybean (Glycine max), has been used as a symbiosis model system. Using a sensitive geLC-MS/MS proteomics approach, we report the identification of 2315 B. japonicum strain USDA110 proteins (27.8% of the theoretical proteome) that are expressed 21 days post infection in symbiosis with soybean cultivated in growth chambers, substantially expanding the previously known symbiosis proteome. Integration of transcriptomics data generated under the same conditions (2780 expressed genes) allowed us to compile a comprehensive expression profile of B. japonicum during soybean symbiosis, which comprises 3587 genes/proteins (43% of the predicted B. japonicum genes/proteins). Analysis of this data set revealed both the biases and the complementarity of these global profiling technologies. A functional classification and pathway analysis showed that most of the proteins involved in carbon and nitrogen metabolism are expressed, including a complete set of tricarboxylic acid cycle enzymes, several gluconeogenesis and pentose phosphate pathway enzymes, as well as several proteins that were previously not considered to be present during symbiosis. Congruent results were obtained for B. japonicum bacteroids harvested from soybeans grown under field conditions.
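The complementarity of the two profiling technologies can be made concrete with inclusion-exclusion on the reported counts. The sketch below assumes that the 3587 genes/proteins are the plain union of the 2315 proteomics and 2780 transcriptomics identifications on a shared identifier space; the abstract does not state this explicitly, and the function name `venn_counts` is hypothetical.

```python
def venn_counts(n_a, n_b, n_union):
    """Split two overlapping sets into shared and exclusive counts.

    Uses inclusion-exclusion: |A and B| = |A| + |B| - |A or B|.
    """
    both = n_a + n_b - n_union
    if both < 0 or both > min(n_a, n_b):
        raise ValueError("counts are inconsistent with a simple union")
    return {"both": both, "only_a": n_a - both, "only_b": n_b - both}


# Counts quoted in the abstract; treating 3587 as a simple union is an assumption.
counts = venn_counts(n_a=2315, n_b=2780, n_union=3587)
print(counts)
```

Under that assumption, 1508 genes/proteins would be supported by both methods, 807 by proteomics alone, and 1272 by transcriptomics alone.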
Proteomes, the ensembles of all proteins expressed by cells or tissues, are typically analysed by mass spectrometry. Recent technical and computational advances have greatly increased the fraction of a proteome that can be identified and quantified in a single study. Current mass spectrometry-based proteomic strategies have the potential to reproducibly, accurately, quantitatively and comprehensively measure any protein or whole proteomes from cells and tissues at different states. Achieving these goals will require complete proteome maps and analytical strategies that use these maps as prior information and will greatly enhance the impact of proteomics on biological and clinical research.
Drosophila melanogaster is emerging as a powerful model system for the study of cardiac disease. Establishing peptide and protein maps of the Drosophila heart is central to implementation of protein network studies that will allow us to assess the hallmarks of Drosophila heart pathogenesis and gauge the degree of conservation with human disease mechanisms on a systems level. Using a gel-LC-MS/MS approach, we identified 1228 protein clusters from 145 dissected adult fly hearts. Contractile, cytostructural and mitochondrial proteins were most abundant, consistent with electron micrographs of the Drosophila cardiac tube. Functional/ontological enrichment analysis further showed that proteins involved in glycolysis, Ca2+-binding, redox, and G-protein signaling, among other processes, are also over-represented. Comparison with a mouse heart proteome revealed conservation at the level of molecular function, biological processes and cellular components. The subsisting peptidome encompassed 5169 distinct heart-associated peptides, of which 1293 (25%) had not been identified in a recent Drosophila peptide compendium. PeptideClassifier analysis was further used to map peptides to specific gene models. A total of 1872 peptides provide valuable information about protein isoform groups, whereas a further 3112 uniquely identify specific protein isoforms and may be used as a heart-associated peptide resource for quantitative proteomic approaches based on multiple-reaction monitoring. In summary, identification of excitation-contraction protein landmarks, orthologues of proteins associated with cardiovascular defects, and conservation of protein ontologies provides testimony to the heart-like character of the Drosophila cardiac tube and to the utility of proteomics as a complement to the power of genetics in this growing model of human heart disease.
One of the major goals of proteomics is the comprehensive and accurate description of a proteome. Shotgun proteomics, the method of choice for the analysis of complex protein mixtures, requires that experimentally observed peptides are mapped back to the proteins they were derived from. This process is also known as protein inference. We present Markovian Inference of Proteins and Gene Models (MIPGEM), a statistical model based on clearly stated assumptions to address the problem of protein and gene model inference for shotgun proteomics data. In particular, we are dealing with dependencies among peptides and proteins using a Markovian assumption on k-partite graphs. We are also addressing the problems of shared peptides and ambiguous proteins by scoring the encoding gene models. Empirical results on two control datasets with synthetic mixtures of proteins and on complex protein samples of Saccharomyces cerevisiae, Drosophila melanogaster, and Arabidopsis thaliana suggest that the results with MIPGEM are competitive with existing tools for protein inference.
Proteomics, the comprehensive and quantitative analysis of proteins that are expressed in a given organ, tissue, or cell line, provides unique insights into biological systems that cannot be provided by genomics or transcriptomics approaches (1). With the advent of shotgun proteomics [gel-free liquid chromatography tandem mass spectrometry (LC-MS/MS)] (2), the number of distinct proteins that could be identified from complex samples has significantly increased compared to more traditional gel-based approaches. Shotgun proteomics has become the method of choice for the analysis of complex protein mixtures (1). Briefly, proteins are extracted from their biological source and enzymatically digested into peptides (usually using trypsin). The peptides are then separated by liquid chromatography and analyzed by tandem mass spectrometry.
Peptides are thus the elementary unit of measure in LC-MS/MS (from now on, we assume that protein implies protein sequence and peptide implies peptide sequence). In this paper, we focus on a probabilistic model to address the problem of protein inference. The peptide identifications, i.e., the (posterior) probabilities that a given peptide is present in a sample of interest (or a corresponding discriminant score), are the input for our statistical model and algorithm for inferring posterior probabilities that individual proteins are present in the sample. As one important difference from previous solutions, the Markovian Inference of Proteins and Gene Models (MIPGEM) also allows the presence or absence of gene models to be inferred instead of being restricted to proteins. This is a useful extension for the integration of proteomics and transcriptomics data. Earlier proposals for protein inference models include refs. 3-14. A brief description of some of these methods can be found in ref. 11. The main elements characterizing our approach include the following: (i) We take uncertainties related to the peptide-spectrum matching process into accou...