Investigations of missing proteins (MPs) are being pursued through many bioanalytical strategies. We proposed that proteogenomics of testis tissue was a feasible approach to identify more MPs because testis tissues have higher gene expression levels. Here we combined proteomics and transcriptomics to survey gene expression in human testis tissues from three post-mortem individuals. Proteins were extracted and separated with glycine- and tricine-SDS-PAGE. A total of 9597 protein groups were identified; of these, 166 protein groups were listed as MPs, including 138 groups (83.1%) with transcriptional evidence. A total of 2948 proteins are designated as MPs, and 5.6% of these were identified in this study. The high incidence of MPs in testis tissue indicates that this is a rich resource for MPs. Functional category analysis revealed that testis MPs are mainly involved in sexual reproduction and spermatogenesis. Some of the MPs are potentially involved in tumorigenesis in other tissues. Therefore, this proteogenomics analysis of individual testis tissues provides convincing evidence of the discovery of MPs. All mass spectrometry data from this study have been deposited in the ProteomeXchange (data set identifier PXD002179).
Since 2012, the investigation of missing proteins (MPs) through various biochemical strategies has been one of the critical missions of the Chromosome-Centric Human Proteome Project (C-HPP). On the basis of our previous testis MP study, we reasoned that faster-scanning, higher-resolution mass-spectrometry-based proteomics might be conducive to MP exploration, especially for low-abundance proteins. In this study, a Q-Exactive HF (HF) was used to survey proteins from the same testis tissues, separated by two methods (tricine- and glycine-SDS-PAGE) as previously described. A total of 8526 proteins were identified, and the proteins detected uniquely in the HF data, but not in our previous LTQ Orbitrap Velos (Velos) reanalysis data, tended to be of low abundance. Further transcriptomics analysis showed that the proteins uniquely identified by HF also had lower expression at the mRNA level. Of the 81 total identified MPs, 74 and 39 proteins were listed as MPs in the HF and Velos data sets, respectively. Among these MPs, 47 proteins (43 neXtProt PE2 and 4 PE3) were ranked as confirmed MPs after verification by stringent spectral matching and filtering for isobaric and single-amino-acid variants. Functional investigation of these 47 MPs revealed that 11 MPs were testis-specific proteins and 7 MPs were involved in the spermatogenesis process. Therefore, we concluded that the higher scanning speed and resolution of HF might be factors for improving low-abundance MP identification in future C-HPP studies. All mass-spectrometry data from this study have been deposited in the ProteomeXchange with identifier PXD004092.
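The MP counts in the abstract above (81 in total, 74 in HF, 39 in Velos) imply how many MPs the two instruments share, if one assumes that the 81 MPs are exactly the union of the two data sets. That reading is an assumption, not something the abstract states outright. A minimal sketch of the inclusion-exclusion arithmetic, with a hypothetical helper name:

```python
def overlap_size(union_size: int, size_a: int, size_b: int) -> int:
    """Return |A ∩ B| from |A ∪ B|, |A| and |B| by inclusion-exclusion."""
    shared = size_a + size_b - union_size
    # Guard against counts that cannot describe two real sets.
    if shared < 0 or shared > min(size_a, size_b):
        raise ValueError("inconsistent set sizes")
    return shared


# Counts quoted in the abstract; treating 81 as the union is an assumption.
hf_mps, velos_mps, total_mps = 74, 39, 81

shared = overlap_size(total_mps, hf_mps, velos_mps)
hf_only = hf_mps - shared
velos_only = velos_mps - shared

print(f"shared: {shared}, HF only: {hf_only}, Velos only: {velos_only}")
```

Under that assumption, most of the MPs would be found by HF alone, which is consistent with the abstract's conclusion about the faster, higher-resolution instrument.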
The developed acetylated LysargiNase (Ac-LysargiNase), with superior activity and stability, provides complementary ion types compared with trypsin for MS/MS analysis. Based on the two mirror proteases, we developed a novel de novo sequencing algorithm, pNovoM, which performed with higher efficiency and accuracy compared with other software tools.
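The "mirror" relationship mentioned above comes from cleavage specificity: trypsin cleaves C-terminal to lysine (K) and arginine (R), whereas LysargiNase cleaves N-terminal to the same residues, so the basic residue ends up at opposite ends of the resulting peptides. The sketch below is a simplified in-silico digestion that illustrates only this idea. The function name and the toy sequence are hypothetical, missed cleavages and the proline rule are ignored, and it is not the pNovoM algorithm.

```python
def cleave(sequence: str, side: str, sites: str = "KR") -> list[str]:
    """Split a protein sequence at every K/R.

    side="C": cut after the residue (trypsin-like).
    side="N": cut before the residue (LysargiNase-like).
    """
    if side not in ("C", "N"):
        raise ValueError("side must be 'C' or 'N'")
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in sites:
            cut = i + 1 if side == "C" else i
            if cut > start:  # skip empty fragments at the sequence start
                peptides.append(sequence[start:cut])
                start = cut
    if start < len(sequence):
        peptides.append(sequence[start:])  # trailing fragment
    return peptides


toy_protein = "ACDKEFGRHIL"  # made-up sequence for illustration only
tryptic = cleave(toy_protein, side="C")
mirror = cleave(toy_protein, side="N")
print(tryptic)
print(mirror)
```

Peptides with a C-terminal basic residue tend to give stronger y-ion series, and those with an N-terminal basic residue stronger b-ion series, which is why the two digests can complement each other in MS/MS.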
Hepatitis B virus X protein (HBx) participates in the occurrence and development of hepatocellular carcinoma (HCC) as a multifunctional regulatory factor. However, the underlying molecular mechanism remains obscure. Here, we describe the use of the p21HBx/+ mouse and the SILAM (Stable Isotope Labeling in Mammals) strategy to define the pathological mechanisms for the occurrence and development of HBx-induced liver cancer. We systematically compared a series of proteome samples from regular mice, 12- and 24-month-old p21HBx/+ mice representing the inflammation and HCC stages of liver disease, respectively, and their nontransgenic wild-type (WT) littermates. In total, we identified 22 and 97 differentially expressed proteins, respectively, out of a total of 2473 quantified proteins. Bioinformatics analysis suggested that the lipid metabolism and CDC42-induced cytoskeleton remodeling pathways were strongly activated by the HBx transgene. Interestingly, the protein-protein interaction MS study revealed that HBx directly interacted with multiple proteins in these two pathways. The same up-regulation of cytoskeleton- and lipid-metabolism-related proteins, including CDC42, CFL1, PPARγ and ADFP, was also observed in Huh-7 cells transfected with HBx. More importantly, CFL1 and ADFP specifically accumulated in HBV-associated HCC (HBV-HCC) patient samples, and their expression levels were positively correlated with the severity of HBV-related liver disease. These results provide evidence that HBx induces the dysregulation of cytoskeleton remodeling and lipid metabolism and leads to the occurrence and development of liver cancer. CFL1 and ADFP might serve as potential biomarkers for the prognosis and diagnosis of HBV-HCC.
As part of the Chromosome-Centric Human Proteome Project (C-HPP) mission, laboratories all over the world have tried to map all missing proteins (MPs) since 2012. On the basis of the first and second Chinese Chromosome Proteome Database (CCPD 1.0 and 2.0) studies, we developed systematic enrichment strategies to identify MPs that fell into four classes: (1) low molecular weight (LMW) proteins, (2) membrane proteins, (3) proteins that contained various post-translational modifications (PTMs), and (4) nucleic acid-associated proteins. Of 8845 proteins identified in 7 data sets, 79 proteins were classified as MPs. Among data sets derived from different enrichment strategies, the data sets for LMW and PTM yielded the most novel MPs. In addition, we found that some MPs were identified in multiple data sets, which implied that tandem enrichment methods might improve the ability to identify MPs. Moreover, low expression at the transcription level was the major reason these MPs were "missing"; however, MPs with higher expression levels also evaded identification, most likely due to other characteristics such as LMW, high hydrophobicity and PTM. By combining a stringent manual check of the MS2 spectra with peptide synthesis verification, we confirmed 30 MPs (neXtProt PE2 ∼ PE4) and 6 potential MPs (neXtProt PE5) with authentic MS evidence. By integrating our large-scale data sets of CCPD 2.0, the number of identified proteins has increased considerably beyond simulation saturation. Here, we show that special enrichment strategies can break through the data saturation bottleneck, which could increase the efficiency of MP identification in future C-HPP studies. All 7 data sets have been uploaded to ProteomeXchange with the identifier PXD002255.
A membrane protein enrichment method composed of ultracentrifugation and detergent-based extraction was first developed using the MCF7 cell line. Then, in-solution digestion with detergents and eFASP (enhanced filter-aided sample preparation) with detergents were compared with the time-consuming in-gel digestion method. Among the in-solution digestion strategies, eFASP combined with RapiGest identified 1125 membrane proteins and eFASP combined with sodium deoxycholate identified 1069 membrane proteins; by comparison, in-gel digestion identified 1091 membrane proteins. In total, with the five digestion methods, 1390 membrane proteins were identified with ≥1 unique peptide, among which 1345 membrane proteins contained ≥2 unique peptides. This is the largest membrane protein data set for the MCF7 cell line, and even for breast cancer tissue samples. Interestingly, we identified 13 unique peptides belonging to 8 missing proteins (MPs). Finally, eight unique peptides were validated by synthesized peptides. Two proteins were confirmed as MPs, and another two proteins were candidate detections.
The r-Ac-trypsin described here is a recombinant product. In addition, its properties, such as stability, activity, and specificity, were similar or superior to those of commercial products. It can be used in peptide sample preparation in proteomics studies.