Small open reading frames (sORFs) have the translational potential to produce peptides that play essential roles in various biological processes. Nevertheless, many sORF-encoded peptides (SEPs) remain only predicted. Here, we construct a strategy that combines top-down and de novo sequencing to improve SEP identification and sequence coverage. With de novo sequencing, we identified 1682 peptides mapping to 2544 human sORFs, all of which were first characterized in this work. Two-thirds of these new sORFs have reading frame shifts and use a non-ATG start codon. The top-down approach identified 241 human SEPs with high sequence coverage. The average peptide length was 19 amino acids (AA) from the bottom-up database search, 9 AA from de novo sequencing, and 25 AA from the top-down approach. Longer peptides increase sequence coverage and therefore distinguish SEPs from known gene coding sequences more efficiently. The top-down approach has the advantage of identifying peptides with sequential K/R residues or high K/R content, which are unfavorable for the bottom-up approach. Our method can uncover new coding sORFs and obtain highly accurate sequences of their SEPs, which can also benefit future functional research.
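The remark about K/R-rich peptides can be illustrated with a small in-silico digestion. The sketch below is not the authors' pipeline. It assumes a simplified trypsin rule (cleave after K or R unless the next residue is P, no missed cleavages) and a hypothetical minimum detectable peptide length of 7 AA; the function names and the example sequence are invented for illustration.

```python
import re


def tryptic_digest(sequence: str) -> list[str]:
    """Split a protein sequence after K or R, except when followed by P.

    Simplified trypsin rule with no missed cleavages (an assumption).
    """
    # Zero-width split: position preceded by K/R and not followed by P.
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def bottom_up_coverage(sequence: str, min_len: int = 7) -> float:
    """Fraction of residues in tryptic peptides of at least min_len residues.

    min_len = 7 is a hypothetical detectability cutoff, not a value
    taken from the abstract.
    """
    peptides = tryptic_digest(sequence)
    covered = sum(len(p) for p in peptides if len(p) >= min_len)
    return covered / len(sequence)


# Hypothetical 17-residue SEP with a K/R-rich N-terminus.
sep = "MKRKKRLSGTPEAVLLK"
print(tryptic_digest(sep))
print(round(bottom_up_coverage(sep), 2))
```

Under these assumptions, the K/R-rich stretch is cut into fragments of one or two residues that fall below the length cutoff, so only the C-terminal peptide contributes to coverage. An intact-peptide (top-down) measurement is not subject to this fragmentation.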
Small open reading frame-encoded peptides (SEPs) are microproteins with a length of 100 amino acids or less, which may play a critical role in maintaining cell homeostasis under stress. Therefore, we used mass spectrometry-based proteomics to explore microproteins potentially involved in cellular stress responses in Saccharomyces cerevisiae. A total of 225 microproteins with 1920 unique peptides were identified under six culture conditions: normal, oxidation, starvation, ultraviolet radiation, heat shock, and heat shock with starvation. Among these microproteins, we found 70 SEPs with 75 unique peptides. The annotated microproteins are involved in stress-related processes, such as cell redox reactions, cell wall modification, protein folding and degradation, and DNA damage repair. This suggests that SEPs may serve similar functions under stress conditions. For example, SEP IP_008057, translated from a short coding sequence of YJL159W, may play a role in heat shock. This study identified stress-responsive SEPs in S. cerevisiae and provides valuable information for determining the functions of these proteins, which enriches the genome and proteome of S. cerevisiae and offers clues for improving its stress tolerance.