Small open reading frames (smORFs) are short DNA sequences that are able to encode small peptides of less than 100 amino acids. Study of these elements has been neglected despite thousands existing in our genomes. We and others previously showed that peptides as short as 11 amino acids are translated and provide essential functions during insect development. Here, we describe two peptides of less than 30 amino acids regulating calcium transport, and hence influencing regular muscle contraction, in the Drosophila heart. These peptides seem conserved for more than 550 million years in a range of species from flies to humans, in which they have been implicated in cardiac pathologies. Such conservation suggests that the mechanisms for heart regulation are ancient and that smORFs may be a fundamental genome component that should be studied systematically.
Background: The relationship between DNA sequence and encoded information is still an unsolved puzzle. The number of protein-coding genes in higher eukaryotes identified by genome projects is lower than was expected, while a considerable amount of putatively non-coding transcription has been detected. Functional small open reading frames (smORFs) are known to exist in several organisms. However, coding sequence detection methods are biased against detecting such very short open reading frames. Thus, a substantial number of non-canonical coding regions encoding short peptides might await characterization. Results: Using bioinformatics methods, we have searched for smORFs of less than 100 amino acids in the putatively non-coding euchromatic DNA of Drosophila melanogaster, and initially identified nearly 600,000 of them. We have studied the pattern of conservation of these smORFs as coding entities between D. melanogaster and Drosophila pseudoobscura, their presence in syntenic and in transcribed regions of the genome, and their ratio of conservative versus non-conservative nucleotide changes. For negative controls, we compared the results with those obtained using random short sequences, while a positive control was provided by smORFs validated by proteomics data. Conclusions: The combination of these analyses led us to postulate the existence of at least 401 functional smORFs in Drosophila, with the possibility that as many as 4,561 such functional smORFs may exist.
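The first step described in this abstract, scanning DNA for open reading frames of fewer than 100 amino acids, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `find_smorfs`, the ATG-to-stop definition of an ORF, the forward-strand-only scan, and the `min_aa` lower bound are all assumptions made for illustration.

```python
# Hypothetical sketch: scan the three forward reading frames of a DNA
# string for ATG...stop ORFs encoding fewer than max_aa amino acids.
# The reverse strand and the conservation/transcription filters
# mentioned in the abstract are not handled here.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_smorfs(seq, max_aa=100, min_aa=10):
    """Return a list of (start, end, n_aa) tuples.

    start and end are 0-based, end-exclusive coordinates that include
    the stop codon; n_aa counts codons from ATG up to, but excluding,
    the stop codon. min_aa is an assumed lower bound, not a value
    taken from the abstract.
    """
    seq = seq.upper()
    hits = []
    for frame in range(3):
        start = None  # position of the first ATG of the current ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_aa = (i - start) // 3
                if min_aa <= n_aa < max_aa:
                    hits.append((start, i + 3, n_aa))
                start = None  # resume the search after this stop codon
    return hits


example = "CC" + "ATG" + "GCT" * 10 + "TAA" + "GG"
short_hits = find_smorfs(example)
long_hits = find_smorfs("ATG" + "GCT" * 120 + "TAA")
```

A scan of this kind reports every short ATG-to-stop stretch, which is consistent with the very large initial candidate count the abstract reports and with the need for the conservation, synteny, and transcription filters it describes.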
Hundreds of previously unidentified functional small peptides could exist in most genomes, but these sequences have been generally overlooked. The discovery of genes encoding small peptides with important functions in different organisms has ignited interest in these sequences and led to an increasing amount of effort towards their identification. Here, we review the advances, both computational and biochemical, that are leading the way in the discovery of putatively functional smORFs, as well as the functional studies that have been carried out as a consequence of these searches. The evidence suggests that smORFs form a substantial part of our genomes, and that their encoded peptides could have important roles in a variety of cellular functions.
Translation of hundreds of small ORFs (smORFs) of less than 100 amino acids has recently been revealed in vertebrates and Drosophila. Some of these peptides have essential and conserved cellular functions. In Drosophila, we have predicted a particular smORF class encoding ~80 aa hydrophobic peptides, which may function in membranes and cell organelles. Here, we characterise hemotin, a gene encoding an 88aa transmembrane smORF peptide localised to early endosomes in Drosophila macrophages. hemotin regulates endosomal maturation during phagocytosis by repressing the cooperation of 14-3-3ζ with specific phosphatidylinositol (PI) enzymes. hemotin mutants accumulate undigested phagocytic material inside enlarged endo-lysosomes and as a result, hemotin mutants have reduced ability to fight bacteria, and hence, have severely reduced life span and resistance to infections. We identify Stannin, a peptide involved in organometallic toxicity, as the Hemotin functional homologue in vertebrates, showing that this novel regulator of phagocytic processing is widely conserved, emphasizing the significance of smORF peptides in cell biology and disease.
Up to 80% of individuals with myotonic dystrophy type 1 (DM1) will develop cardiac abnormalities at some point during the progression of their disease, the most common of which is heart blockage of varying degrees. Such blockage is characterized by conduction defects and supraventricular and ventricular tachycardia, and carries a high risk of sudden cardiac death. Despite its importance, very few animal model studies have focused on the heart dysfunction in DM1. Here, we describe the characterization of the heart phenotype in a Drosophila model expressing pure expanded CUG repeats under the control of the cardiomyocyte-specific driver GMH5-Gal4. Morphologically, expression of 250 CUG repeats caused abnormalities in the parallel alignment of the spiral myofibrils in dissected fly hearts, as revealed by phalloidin staining. Moreover, combined immunofluorescence and in situ hybridization of Muscleblind and CUG repeats, respectively, confirmed detectable ribonuclear foci and Muscleblind sequestration, characteristic features of DM1, exclusively in flies expressing the expanded CTG repeats. Similarly to what has been reported in humans with DM1, heart-specific expression of toxic RNA resulted in reduced survival, increased arrhythmia, altered diastolic and systolic function, reduced heart tube diameters and reduced contractility in the model flies. As a proof of concept that the fly heart model can be used for in vivo testing of promising therapeutic compounds, we fed flies with pentamidine, a compound previously described to improve DM1 phenotypes. Pentamidine not only released Muscleblind from the CUG RNA repeats and reduced ribonuclear formation in the Drosophila heart, but also rescued heart arrhythmicity and contractility, and improved fly survival in animals expressing 250 CUG repeats.
Long noncoding RNAs (lncRNAs) are transcripts longer than 200 nucleotides but lacking canonical coding sequences. Because lncRNAs are apparently unable to produce peptides, their function seems to rely only on RNA expression, sequence and structure. Here, we exhaustively detect in vivo translation of small open reading frames (small ORFs) within lncRNAs using ribosome profiling during Drosophila melanogaster embryogenesis. We show that around 30% of lncRNAs contain small ORFs engaged by ribosomes, leading to regulated translation of 100 to 300 micropeptides. We identify lncRNA features that favour translation, such as cistronicity, Kozak sequences, and conservation. For the latter, we develop a bioinformatics pipeline to detect small ORF homologues, and reveal evidence of natural selection favouring the conservation of micropeptide sequence and function across evolution. Our results expand the repertoire of lncRNA biochemical functions, and suggest that lncRNAs give rise to novel coding genes throughout evolution. Since most lncRNAs contain small ORFs with as yet unknown translation potential, we propose to rename them “long non-canonical RNAs”.
Small Open Reading Frames (smORFs) coding for peptides of less than 100 amino-acids are an enigmatic and pervasive gene class, found in the tens of thousands in metazoan genomes. Here we reveal a short 80 amino-acid peptide (Pegasus) which enhances Wingless/Wnt1 protein short-range diffusion and signalling. During Drosophila wing development, Wingless has sequential functions, including late induction of proneural gene expression and wing margin development. Pegasus mutants produce wing margin defects and proneural expression loss similar to those of Wingless. Pegasus is secreted, and co-localizes and co-immunoprecipitates with Wingless, suggesting their physical interaction. Finally, measurements of fixed and in-vivo Wingless gradients support that Pegasus increases Wingless diffusion in order to enhance its signalling. Our results unveil a new element in Wingless signalling and clarify the patterning role of Wingless diffusion, while corroborating the link between small open reading frame peptides, and regulation of known proteins with membrane-related functions.