Cuitong He scite author profile

Cuitong He

5 Publications

71 Citation Statements Received

237 Citation Statements Given

How they've been cited

How they cite others

211

237

Affiliations

Center for Life Sciences, Peking University, Anhui Medical University

Publications


Enrichment-Based Proteogenomics Identifies Microproteins, Missing Proteins, and Novel smORFs in Saccharomyces cerevisiae

Jia, Zhang, et al. 2018. J. Proteome Res.


Microproteins are peptides of 100 amino acids (AA) or fewer, encoded by small open reading frames (smORFs). Microproteins have been shown to participate in and regulate a wide range of cellular functions. However, annotating and identifying them is challenging, in part owing to their low molecular weight, low abundance, and hydrophobicity. These factors have left smORFs unannotated during genome processing and have made their identification at the protein level difficult. Large-scale enrichment of microproteins in proteogenomics has made it possible to efficiently identify microproteins and discover unannotated smORFs in Saccharomyces cerevisiae. We integrated four microprotein-specific enrichment strategies to enhance coverage. We identified 117 microproteins, verified 31 missing proteins (MPs), and discovered 3 novel smORFs. All 31 MPs were confirmed by spectrum quality checking. The three novel smORFs (YKL104W-A, YHR052C-B, and YHR054C-B) were retained after spectrum quality checking, peptide synthesis, homologue matching, and further checks. This study not only demonstrates that there are potential smORF candidates still to be annotated in an extensively studied organism but also presents an efficient strategy for the discovery of small MPs. All MS data sets have been deposited to the ProteomeXchange with identifier PXD008586.
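The 100-AA cutoff in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a smORF is simply an ATG-to-stop reading frame on the forward strand, and the function name `find_smorfs` and its parameters are hypothetical.

```python
# Illustrative only: scans the three forward-strand frames for ATG...stop
# reading frames that encode at most `max_aa` amino acids.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_smorfs(dna, max_aa=100, min_aa=1):
    """Return (start, end, n_aa) tuples; `end` is exclusive and includes the stop codon."""
    dna = dna.upper()
    hits = []
    for frame in range(3):
        start = None  # index of the first ATG in the currently open ORF
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_aa = (i - start) // 3  # residues encoded, stop codon excluded
                if min_aa <= n_aa <= max_aa:
                    hits.append((start, i + 3, n_aa))
                start = None
    return hits

if __name__ == "__main__":
    print(find_smorfs("CCATGAAATTTTAGGG"))
```

A real proteogenomics workflow would also scan the reverse strand and match candidates against mass spectra; this sketch only shows the length filter.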


Digging for Missing Proteins Using Low-Molecular-Weight Protein Enrichment and a “Mirror Protease” Strategy

Sun, Shi, et al. 2018. J. Proteome Res.


In 2012, the Chromosome-centric Human Proteome Project (C-HPP) launched an investigation of missing proteins (MPs) to complete the Human Proteome Project (HPP). The majority of the MPs were distributed in low-molecular-weight (LMW) ranges, especially from 0 to 40 kDa. LMW protein identification is challenging, owing to their short length, low abundance, and hydrophobicity. Furthermore, many sequences from trypsin digestion are unlikely to yield detectable peptides or MS2 spectra of reasonable quality. Therefore, we focused on small MPs by combining LMW protein enrichment with a complementary protease-pair strategy, using trypsin and LysargiNase, for human testis samples. In-depth testis LMW protein profiling resulted in the identification of 4063 proteins, of which 2565 were LMW proteins and 1130 had pairs of peptides generated from both trypsin and LysargiNase. This provided additional mass spectral evidence for further verification of small MPs. Finally, two MPs were verified from the seven MP candidates. One of them, Q8N688, was verified with two series of continuous and complementary b/y-product ions from the pairs of spectra for tryptic and LysargiNase-digested peptides after "mirror spectrum" matching. This enabled confident identification of the representative peptides for the target MPs. In contrast, the two verified peptides for Q86WR6 were identified with the same strategy from the gel-separation and gel-elution samples, respectively. Although the other five MP candidates showed high-quality spectra, they could not be sufficiently distinguished as PE1s and require further verification. All MS data sets have been deposited in the ProteomeXchange with identifier PXD010093.
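The "mirror protease" idea in the abstract above can be sketched in silico. This is a simplified illustration, not the authors' workflow: it assumes trypsin cuts after every K/R and LysargiNase cuts before every K/R, with no missed cleavages and no proline rule. The helper names (`digest`, `trypsin`, `lysarginase`, `mirror_pairs`) and the example sequence are hypothetical.

```python
def digest(protein, cleave_before):
    """Split a protein at K/R, cutting before the residue if cleave_before, else after it."""
    peptides, current = [], ""
    for aa in protein:
        if cleave_before and aa in "KR" and current:
            peptides.append(current)
            current = ""
        current += aa
        if not cleave_before and aa in "KR":
            peptides.append(current)
            current = ""
    if current:
        peptides.append(current)
    return peptides

def trypsin(protein):
    # Simplified rule: cleave C-terminal to K/R.
    return digest(protein, cleave_before=False)

def lysarginase(protein):
    # Simplified rule: cleave N-terminal to K/R.
    return digest(protein, cleave_before=True)

def mirror_pairs(protein):
    """Pair peptides that share a core but carry K/R at opposite termini."""
    tryptic = {p[:-1]: p for p in trypsin(protein) if len(p) > 1 and p[-1] in "KR"}
    pairs = []
    for p in lysarginase(protein):
        core = p[1:]
        if p[0] in "KR" and core in tryptic:
            pairs.append((tryptic[core], p))
    return pairs

if __name__ == "__main__":
    print(mirror_pairs("MAGKLSERTPK"))
```

Each pair covers the same core sequence with the basic residue at the opposite end, which is why the two spectra can supply complementary fragment-ion series for the same stretch of protein.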


Proteogenomics Integrating Novel Junction Peptide Identification Strategy Discovers Three Novel Protein Isoforms of Human NHSL1 and EEF1B2

Guo, Tian, et al. 2021. J. Proteome Res.


In eukaryotes, alternative pre-mRNA splicing allows a single gene to encode different protein isoforms that function in many biological processes, and these isoforms are used as biomarkers or therapeutic targets for diseases. Although protein isoforms in the human genome are well annotated, we speculate that some low-abundance protein isoforms may still be under-annotated because most genes have a primary coding product and alternative protein isoforms tend to be under-expressed. A peptide coencoded by a novel exon and an annotated exon separated by an intron is known as a novel junction peptide. In the absence of known transcripts and homologous proteins, traditional whole-genome six-frame translation-based proteogenomics cannot identify novel junction peptides, and it cannot capture novel alternative splice sites. In this article, we first propose a strategy and tool for identifying novel junction peptides, called CJunction, which we then integrate into a proteogenomics process specifically designed for novel protein isoform discovery and apply to the analysis of a deep-coverage HeLa mass spectrometry data set with identifier PXD004452 in ProteomeXchange. We succeeded in identifying and validating three novel protein isoforms of two functionally important genes, NHSL1 (causative gene of Nance-Horan syndrome) and EEF1B2 (translation elongation factor), which validates our hypothesis. These novel protein isoforms differ significantly in sequence from the annotated gene-coding products because of their novel N-terminus, suggesting that they may have substantially different functions.


Advances in small protein identification

He, Zhang, Xu 2018. Sci. Sin.-Vitae


Novel Proteoform Discovery by Precise Semi-De Novo Sequencing of Novel Junction Peptides

Wong 2023. Anal. Chem.


Alternative splicing allows a small number of human genes to encode a large number of proteoforms that play essential roles in normal and disease physiology. Some low-abundance proteoforms may remain undiscovered due to limited detection and analysis capabilities. Peptides coencoded by novel exons and annotated exons separated by introns are called novel junction peptides, which are the key to identifying novel proteoforms. Traditional de novo sequencing does not take into account the specific composition of novel junction peptides and is therefore less accurate. We first developed a novel de novo sequencing algorithm, CNovo, which outperformed the mainstream PEAKS and Novor in all six test sets. We then built on CNovo to develop a semi-de novo sequencing algorithm, SpliceNovo, specifically for identifying novel junction peptides. SpliceNovo identifies junction peptides with much higher accuracy than CNovo, CJunction, PEAKS, and Novor. The built-in CNovo in SpliceNovo can also be replaced with other, more accurate de novo sequencing algorithms to further improve its performance. We also successfully identified and validated two novel proteoforms of the human EIF4G1 and ELAVL1 genes with SpliceNovo. Our results significantly improve the ability to discover novel proteoforms through de novo sequencing.

