Motivation: Proteogenomics is well accepted as a tool for discovering novel genes. In most conventional proteogenomic studies, a global false discovery rate (FDR) is used to filter out false positives when identifying credible novel peptides. However, the actual level of false positives among novel peptides is often poorly controlled and behaves differently for different genomes.
Results: To model this problem quantitatively, we theoretically analyze the subgroup FDRs of annotated and novel peptides. Our analysis shows that the annotation completeness ratio of a genome is the dominant factor influencing the subgroup FDR of novel peptides. Experimental results on two real datasets, from Escherichia coli and Mycobacterium tuberculosis, support our conjecture.
Contact: yfu@amss.ac.cn or xupingghy@gmail.com or smhe@ict.ac.cn
Supplementary information: Supplementary data are available at Bioinformatics online.
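The difference between a global FDR and subgroup FDRs can be illustrated with a minimal sketch. It assumes a simple target-decoy estimate (decoys divided by targets above a score threshold) and assumes that every match, decoy or target, carries a label saying whether it came from the annotated or the novel part of the search space. The names `PSM`, `estimate_fdr`, and `global_and_subgroup_fdr` are hypothetical and are not taken from the paper, whose theoretical model is not reproduced here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PSM:
    """One peptide-spectrum match (hypothetical record layout)."""
    score: float
    is_decoy: bool   # True if matched to a decoy sequence
    is_novel: bool   # True if matched to the novel (unannotated) search space


def estimate_fdr(psms: List[PSM]) -> float:
    """Simple target-decoy estimate: number of decoys / number of targets."""
    targets = sum(1 for p in psms if not p.is_decoy)
    decoys = sum(1 for p in psms if p.is_decoy)
    return decoys / targets if targets else 0.0


def global_and_subgroup_fdr(psms: List[PSM], threshold: float) -> Dict[str, float]:
    """Estimate the FDR of all accepted matches and of each subgroup separately."""
    accepted = [p for p in psms if p.score >= threshold]
    return {
        "global": estimate_fdr(accepted),
        "annotated": estimate_fdr([p for p in accepted if not p.is_novel]),
        "novel": estimate_fdr([p for p in accepted if p.is_novel]),
    }
```

With a toy list in which almost all accepted matches are annotated, the global estimate can sit near 1% to 2% while the estimate for the small novel subgroup is far higher, which is the kind of discrepancy the abstract describes.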
The developed acetylated LysargiNase (Ac-LysargiNase), with superior activity and stability, provides complementary ion types compared with trypsin for MS/MS analysis. Based on the two mirror proteases, we developed a novel de novo sequencing algorithm, pNovoM, which performed with higher efficiency and accuracy compared with other software tools.
Trypsin specifically cleaves at the C-terminal side of lysine and arginine residues but often fails to cleave at modified lysines, such as ubiquitinated lysines, resulting in uncleaved K-ε-GG peptides. Consequently, identifications of cleaved ubiquitinated peptides have often been regarded as false positives and discarded. Interestingly, unexpected cleavage at the K48-linked ubiquitin chain has been reported, suggesting a latent ability of trypsin to cleave at ubiquitinated lysine residues. However, it remains unclear whether other trypsin-cleavable ubiquitinated sites are present. In this study, we verified that trypsin can cleave K6- and K63-linked chains in addition to K48-linked chains. The uncleaved K-ε-GG peptide was generated quickly and efficiently during trypsin digestion, whereas cleaved ones were produced with much lower efficiency. The K-ε-GG antibody was then shown to efficiently enrich the cleaved K-ε-GG peptides, and several published large-scale ubiquitylation datasets were re-analyzed to interrogate the sequence features of cleaved sites. In total, more than 2400 cleaved ubiquitinated peptides were identified in the K-ε-GG and UbiSite antibody-based datasets. Lysine was significantly enriched upstream of the cleaved modified K. The kinetic activity of trypsin in cleaving ubiquitinated peptides was further elucidated. We suggest that cleaved K-ε-GG sites with a high post-translational modification probability (≥0.75) should be considered true positives in future ubiquitome analyses.
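The suggested acceptance rule can be sketched as a small filter. This is an illustration under stated assumptions, not the authors' pipeline: it assumes that a "cleaved" K-ε-GG peptide is one whose GG-modified lysine is the C-terminal residue of the peptide, that the site probability is supplied by the search or localization software, and it ignores special cases such as protein C-termini. The function name `classify_kgg_site` and its parameters are hypothetical.

```python
def classify_kgg_site(
    peptide: str,
    gg_position: int,
    site_probability: float,
    min_probability: float = 0.75,
) -> str:
    """Classify a K-GG site as uncleaved, cleaved_accepted, or cleaved_rejected.

    gg_position is the 1-based index of the GG-modified lysine in the peptide.
    Assumption: the site counts as "cleaved" when the modified lysine is the
    last residue, i.e. trypsin cut directly after the ubiquitinated lysine.
    """
    if not 1 <= gg_position <= len(peptide):
        raise ValueError("gg_position is outside the peptide")
    if peptide[gg_position - 1] != "K":
        raise ValueError("the modified residue must be a lysine")

    # Modified lysine inside the peptide: the conventional uncleaved case.
    if gg_position < len(peptide):
        return "uncleaved"

    # Modified lysine at the C-terminus: apply the probability threshold
    # (0.75 is the value suggested in the abstract).
    if site_probability >= min_probability:
        return "cleaved_accepted"
    return "cleaved_rejected"
```

For example, a call with the hypothetical peptide `"LIFAGK"` modified at position 6 is treated as cleaved and is kept or rejected according to its site probability, while the same lysine inside `"LIFAGKQLEDGR"` is reported as uncleaved.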
Ubiquitin ligases (E3s) serve as key regulators of the ubiquitylation-mediated pathway. Identifying the relationships between an E3 and its substrates is challenging but is required for understanding the regulatory network of ubiquitylation. The low abundance of ubiquitinated conjugates and the high redundancy of E3 substrate regulation make such screening difficult. Herein, we combined SILAC-based quantitative proteomics with two theoretically opposite genetic approaches (overexpression and knockout) to screen for substrates of an E3 (Hrt3, the F-box subunit of the SCF complex). The knockout method could not overcome the constraints mentioned above, whereas the overexpression approach gave access to potential substrates of the E3. We obtained 77 candidates, which are involved in many critical biological processes and need to be verified in the future. Among these candidates, we confirmed the relationship between Nce103 and Hrt3 and linked Hrt3 to oxygen sensitivity and the oxidative stress response, in which Nce103 also takes part. This research is also beneficial for understanding how oxygen supply affects the regulation of yeast growth through the ubiquitination of Nce103.
The Chromosome-Centric Human Proteome
Project (C-HPP) was launched
in 2012 to perfect the annotation of human protein existence by identifying
stronger evidence of the expression of missing proteins (MPs) at the
protein level. After an 8-year worldwide effort, the number
of MPs in the neXtProt database significantly decreased from 5511
(2012-02-24) to 1899 (2020-01-17). It is now more difficult to provide
confident evidence of the remaining MPs because of their specific
characteristics, including low abundance, low molecular weight, unexpected
modifications, transmembrane structure, tissue-expression specificity,
and so on. A higher resolution mass spectrometry (MS) interpretation
engine might provide an opportunity to identify these buried MPs in
complex samples by the combination with multi-tissue large-scale proteomics.
In this study, open-pFind was used to mine MPs from the data on 20 pairs of healthy
human tissues reported by Wang et al. (Mol. Syst. Biol. 2019, 15, e8503), combined with our large-scale testis data
set digested by three enzymes (Glu-C, Lys-C, and trypsin) with specificity
for different amino acid residues (J. Proteome Res. 2019, 18, 4189-4196). A total of 1 535 536
peptides with 17 283 477 peptide-spectrum matches (PSMs)
were mapped to 14 279 protein entries at a false discovery
rate of <1% at the PSM, peptide, and protein levels. A total of
103 MP candidates were identified, among which 86 candidates had more
unique peptides than in our single testis tissue data set. After
rigorous screening, manual checks, peptide synthesis, and matching
with documented peptides from PeptideAtlas, we validated four MPs,
P0C7T8 (duodenum and small intestine), Q8WWZ4 (stomach and rectum),
Q8IV35 (fallopian tube), and O14921 (tonsil), at the protein level.
All MS raw files have been deposited to the ProteomeXchange with identifier
PXD021391.
The ThUBD-HRP probe and the subsequently developed TUF-WB+ method can detect polyubiquitination signals through a one-step incubation, offering high sensitivity, unbiased detection, and a shorter operation time than the antibody-based method.