Background: Upstream open reading frames (uORFs) are located in the 5′-untranslated regions of many eukaryotic mRNAs, and some peptides encoded in these regions play important regulatory roles in controlling main ORF (mORF) translation. To comprehensively identify uORFs encoding functional peptides, genome-wide searches for uORFs with conserved peptide sequences (CPuORFs) have been conducted in various organisms using comparative genomic approaches. However, in animals, CPuORFs have been identified only by comparing uORF sequences between a limited number of closely related species, and it is unclear how many previously identified CPuORFs encode regulatory peptides.
Results: Here, we conducted exhaustive genome-wide searches for animal CPuORFs conserved in various taxonomic ranges, using the ESUCA pipeline, which we recently developed for efficient comprehensive identification of CPuORFs. ESUCA can efficiently compare uORF sequences between an unlimited number of species using BLAST and automatically determine the taxonomic ranges of sequence conservation for each CPuORF. By applying ESUCA to human, chicken, zebrafish, and fruit fly genomes, 1,430 (1,339 novel and 91 known) CPuORFs were identified. We examined the effects of 14 human CPuORFs on mORF translation using a transient expression assay. Through this analysis, we identified six novel regulatory CPuORFs that repressed mORF translation in a sequence-dependent manner, all of which were conserved beyond Amniota.
Conclusions: We discovered a much higher number of animal CPuORFs than previously identified. Furthermore, our results suggest that human CPuORFs conserved beyond Amniota are more likely to encode regulatory peptides than those conserved in narrower taxonomic ranges.

Determination of the taxonomic range of uORF sequence conservation for animal CPuORFs