The availability of the human genome sequence has transformed biomedical research over the past decade. However, an equivalent map for the human proteome with direct measurements of proteins and peptides does not exist yet. Here, we present a draft map of the human proteome using high-resolution Fourier transform mass spectrometry. In-depth proteomic profiling of 30 histologically normal human samples, including 17 adult tissues, 7 fetal tissues and 6 purified primary hematopoietic cells, resulted in identification of proteins encoded by 17,294 genes, accounting for ~84% of the total annotated protein-coding genes in humans. A unique and comprehensive strategy for proteogenomic analysis enabled us to discover a number of novel protein-coding regions, including translated pseudogenes, non-coding RNAs and upstream ORFs. This large human proteome catalog (available as an interactive web-based resource at http://www.humanproteomemap.org) will complement available human genome and transcriptome data to accelerate biomedical research in health and disease.
The Plasma Proteome Database (PPD; http://www.plasmaproteomedatabase.org/) was initially described in 2005 as part of the Human Proteome Organization’s (HUPO’s) pilot initiative on the Human Plasma Proteome Project. Since then, improvements in proteomic technologies and increased throughput have led to identification of a large number of novel plasma proteins. To keep up with this increase in data, we have significantly enriched the proteomic information in PPD. This database currently contains information on 10 546 proteins detected in serum/plasma, of which 3784 have been reported in two or more studies. The latest version of the database also incorporates mass spectrometry-derived data, including experimentally verified proteotypic peptides used for multiple reaction monitoring assays. Other novel features include published plasma/serum concentrations for 1278 proteins along with a separate category of plasma-derived extracellular vesicle proteins. As plasma proteins have become a major thrust in the field of biomarkers, we have enabled a batch-based query, designated Plasma Proteome Explorer, which allows users to screen a list of proteins or peptides against known plasma proteins to assess the novelty of their data set. We believe that PPD will facilitate both clinical and basic research by serving as a comprehensive reference of plasma proteins in humans and will accelerate biomarker discovery and translation efforts.
Interleukin-33 (IL-33) is a member of the IL-1 family of cytokines that play a central role in the regulation of immune responses. Its release from epithelial and endothelial cells is mediated by pro-inflammatory cytokines, cell damage and recognition of pathogen-associated molecular patterns (PAMPs). The activity of IL-33 is mediated by binding to the IL-33 receptor complex (IL-33R) and activation of NF-κB signaling via the classical MyD88/IRAK/TRAF6 module. IL-33 also induces the phosphorylation and activation of ERK1/2, JNK, p38 and PI3K/AKT signaling modules, resulting in the production and release of pro-inflammatory cytokines. Aberrant signaling by IL-33 has been implicated in the pathogenesis of several acute and chronic inflammatory diseases, including asthma, atopic dermatitis, rheumatoid arthritis and ulcerative colitis, among others. Considering the biomedical importance of IL-33, we developed a pathway resource of signaling events mediated by IL-33/IL-33R in this study. Using data mined from the published literature, we describe an integrated pathway reaction map of IL-33/IL-33R consisting of 681 proteins and 765 reactions. These include information pertaining to 19 physical interaction events, 740 enzyme catalysis events, 6 protein translocation events, 4 activation/inhibition events, 9 transcriptional regulators and 2492 gene regulation events. The pathway map is publicly available through NetPath (http://www.netpath.org/), a resource of human signaling pathways developed previously by our group. This resource will provide the scientific community with a platform to facilitate identification of novel therapeutic targets for diseases associated with dysregulated IL-33 signaling. Database URL: http://www.netpath.org/pathways?path_id=NetPath_120
Complementing genome sequence with deep transcriptome and proteome data could enable more accurate assembly and annotation of newly sequenced genomes. Here, we provide a proof-of-concept of an integrated approach for analysis of the genome and proteome of Anopheles stephensi, which is one of the most important vectors of the malaria parasite. To achieve broad coverage of genes, we carried out transcriptome sequencing and deep proteome profiling of multiple anatomically distinct sites. Based on transcriptomic data alone, we identified and corrected 535 events of incomplete genome assembly involving 1196 scaffolds and 868 protein-coding gene models. This proteogenomic approach enabled us to add 365 genes that were missed during genome annotation and identify 917 gene correction events through discovery of 151 novel exons, 297 protein extensions, 231 exon extensions, 192 novel protein start sites, 19 novel translational frames, 28 events of joining of exons, and 76 events of joining of adjacent genes as a single gene. Incorporation of proteomic evidence allowed us to change the designation of more than 87 predicted “noncoding RNAs” to conventional mRNAs coded by protein-coding genes. Importantly, extension of the newly corrected genome assemblies and gene models to 15 other newly assembled Anopheline genomes led to the discovery of a large number of apparent discrepancies in assembly and annotation of these genomes. Our data provide a framework for how future genome sequencing efforts should incorporate transcriptomic and proteomic analysis in combination with simultaneous manual curation to achieve near complete assembly and accurate annotation of genomes.
We previously developed NetPath as a resource for comprehensive manually curated signal transduction pathways. The pathways in NetPath contain a large number of molecules and reactions which can sometimes be difficult to visualize or interpret given their complexity. To overcome this potential limitation, we have developed a set of more stringent curation and inclusion criteria for pathway reactions to generate high-confidence signaling maps. NetSlim is a new resource that contains this ‘core’ subset of reactions for each pathway for easy visualization and manipulation. The pathways in NetSlim are freely available at http://www.netpath.org/netslim. Database URL: www.netpath.org/netslim