Comprehensive genome-wide identification of angiosperm upstream ORFs with peptide sequences conserved in various taxonomic ranges using a novel pipeline, ESUCA

Takahashi, Hiro; Hayashi, Noriya; Hiragori, Yuta; Sasaki, Shun; Motomura, Taichiro; Yamashita, Yoshihisa; Naito, Shuichi; Takahashi, Atsushi; Fuse, Kazuyuki; Satou, Kenji; Endo, Toshinori; Kojima, Shoko; Onouchi, Hitoshi

doi:10.1186/s12864-020-6662-5

Cited by 18 publications

(61 citation statements)

References 57 publications


“…Transcript base sequence dataset construction from EST/TSA/RefSeq RNA (step 0.2). To identify animal CPuORFs, data preparation for ESUCA (step 0.2) was conducted as described in our previous study [32]. We conducted data preparation for ESUCA to identify animal CPuORFs.…”

Section: Transcript Dataset Construction Based On Genome Information (mentioning)

confidence: 99%


Exhaustive identification of conserved upstream open reading frames with potential translational regulatory functions from animal genomes

Takahashi, Miyaki, Onouchi, et al. 2020. Sci Rep


Upstream open reading frames (uORFs) are present in the 5′-untranslated regions of many eukaryotic mRNAs, and some peptides encoded by these regions play important regulatory roles in controlling main ORF (mORF) translation. We previously developed a novel pipeline, ESUCA, to comprehensively identify plant uORFs encoding functional peptides, based on genome-wide identification of uORFs with conserved peptide sequences (CPuORFs). Here, we applied ESUCA to diverse animal genomes, because animal CPuORFs have been identified only by comparing uORF sequences between a limited number of species, and how many previously identified CPuORFs encode regulatory peptides is unclear. By using ESUCA, 1517 (1373 novel and 144 known) CPuORFs were extracted from four evolutionarily divergent animal genomes. We examined the effects of 17 human CPuORFs on mORF translation using transient expression assays. Through these analyses, we identified seven novel regulatory CPuORFs that repressed mORF translation in a sequence-dependent manner, including one conserved only among Eutheria. We discovered a much higher number of animal CPuORFs than previously identified. Since most human CPuORFs identified in this study are conserved across a wide range of Eutheria or a wider taxonomic range, many CPuORFs encoding regulatory peptides are expected to be found in the identified CPuORFs.
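The definition of a uORF given above can be made concrete with a minimal sketch. It is an illustration only and not part of ESUCA: the function name `find_uorfs`, the 0-based `morf_start` coordinate, and the toy transcript are all assumptions introduced here. The sketch reports every ORF whose start codon lies entirely within the 5′-UTR and that ends at the first in-frame stop codon.

```python
# Illustrative sketch only; not the ESUCA implementation.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(transcript, morf_start, start_codon="ATG"):
    """Return (start, end, sequence) for each ORF that starts in the 5'-UTR.

    transcript  : cDNA-style sequence (A/C/G/T) of the whole mRNA
    morf_start  : 0-based index of the first base of the main ORF start codon
    start_codon : codon treated as the uORF initiation codon
    Coordinates are 0-based; `end` is exclusive and includes the stop codon.
    """
    seq = transcript.upper()
    uorfs = []
    # The start codon must lie entirely upstream of the main ORF.
    for start in range(0, morf_start - 2):
        if seq[start:start + 3] != start_codon:
            continue
        # Walk codon by codon until the first in-frame stop codon.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                uorfs.append((start, pos + 3, seq[start:pos + 3]))
                break
    return uorfs


# Toy transcript: a short uORF (ATG GCT TAA) followed by the main ORF at index 17.
transcript = "CCATGGCTTAAGGCACCATGAAATTTTGA"
print(find_uorfs(transcript, morf_start=17))  # [(2, 11, 'ATGGCTTAA')]
```

Passing a different `start_codon` (for example `"CTG"`) would list candidate non-AUG uORFs under the same assumptions.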


“…Taxonomy datasets derived from EST/TSA/RefSeq databases were used at steps 4.3 and 6 of ESUCA. See the Materials and Methods in our previous study [32] for details.…”

Section: Transcript Dataset Construction Based On Genome Information (mentioning)

confidence: 99%


“…To comprehensively identify uORFs encoding functional peptides or proteins, genome-wide searches for uORFs with conserved peptide sequences (CPuORFs) have been conducted using comparative genomic approaches in plants [27–32]. To date, 157 CPuORF families have been identified by comparing 5′-UTR sequences among plant species.…”

Section: Met… (mentioning)

confidence: 99%

“…ESUCA has many unique functions [32], such as efficient comparison of uORF sequences among an unlimited number of species using BLAST, automatic determination of taxonomic ranges of CPuORF sequence conservation, systematic calculation of Ka/Ks ratios of CPuORF sequences, and wide compatibility with any eukaryotic genome whose sequence database is registered in ENSEMBL [33]. By comparing uORF sequences from certain species and those from many other species whose transcript sequence databases are available, ESUCA enables more comprehensive identification of CPuORFs conserved in various taxonomic ranges than conventional comparative genomic approaches, in which uORF sequences are compared among limited numbers of selected species.…”

Section: Met… (mentioning)

confidence: 99%
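The quoted statement mentions automatic determination of the taxonomic range of CPuORF sequence conservation. How ESUCA does this is not described here, so the following is only a hedged sketch of one plausible approach: take the lineages of all species in which a homologous uORF was found and report the most specific taxon they share. The function name `conservation_range` and the simplified lineages are assumptions made for illustration.

```python
# Illustrative sketch only; the actual ESUCA procedure may differ.


def conservation_range(lineages):
    """Return the most specific taxon shared by all lineages, or None.

    lineages : list of lineages, each a list of taxon names ordered
               from the root of the taxonomy down to the species.
    """
    if not lineages:
        return None
    shared = None
    # Compare the lineages rank by rank, starting at the root.
    for taxa in zip(*lineages):
        if all(taxon == taxa[0] for taxon in taxa):
            shared = taxa[0]
        else:
            break
    return shared


# Simplified, hypothetical lineages (many intermediate ranks omitted).
human = ["Metazoa", "Vertebrata", "Amniota", "Mammalia", "Eutheria", "Homo sapiens"]
mouse = ["Metazoa", "Vertebrata", "Amniota", "Mammalia", "Eutheria", "Mus musculus"]
chicken = ["Metazoa", "Vertebrata", "Amniota", "Aves", "Gallus gallus"]

print(conservation_range([human, mouse]))           # Eutheria
print(conservation_range([human, mouse, chicken]))  # Amniota
```

Under this sketch, a uORF with homologs only in human and mouse would be labelled as conserved within Eutheria, while adding a chicken homolog widens the range to Amniota.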


Exhaustive identification of conserved upstream open reading frames with potential translational regulatory functions from animal genomes

Takahashi, Miyaki, Onouchi, et al. 2019. Preprint


Background: Upstream open reading frames (uORFs) are located in the 5′-untranslated regions of many eukaryotic mRNAs, and some peptides encoded in these regions play important regulatory roles in controlling main ORF (mORF) translation. To comprehensively identify uORFs encoding functional peptides, genome-wide searches for uORFs with conserved peptide sequences (CPuORFs) have been conducted in various organisms using comparative genomic approaches. However, in animals, CPuORFs have been identified only by comparing uORF sequences between a limited number of closely related species, and it is unclear how many previously identified CPuORFs encode regulatory peptides.

Results: Here, we conducted exhaustive genome-wide searches for animal CPuORFs conserved in various taxonomic ranges, using the ESUCA pipeline, which we recently developed for efficient comprehensive identification of CPuORFs. ESUCA can efficiently compare uORF sequences between an unlimited number of species using BLAST and automatically determine the taxonomic ranges of sequence conservation for each CPuORF. By applying ESUCA to human, chicken, zebrafish, and fruit fly genomes, 1,430 (1,339 novel and 91 known) CPuORFs were identified. We examined the effects of 14 human CPuORFs on mORF translation using a transient expression assay. Through this analysis, we identified six novel regulatory CPuORFs that repressed mORF translation in a sequence-dependent manner, all of which were conserved beyond Amniota.

Conclusions: We discovered a much higher number of animal CPuORFs than previously identified. Furthermore, our results suggest that human CPuORFs conserved beyond Amniota are more likely to encode regulatory peptides than those conserved in narrower taxonomic ranges.


Genome-wide identification of Arabidopsis non-AUG-initiated upstream ORFs with evolutionarily conserved regulatory sequences that control protein expression levels

et al. 2022


Upstream open reading frames (uORFs) are short ORFs found in the 5′-UTRs of many eukaryotic transcripts and can influence the translation of protein-coding main ORFs (mORFs). Recent genome-wide ribosome profiling studies have revealed that thousands of uORFs initiate translation at non-AUG start codons. However, the physiological significance of these non-AUG uORFs has so far been demonstrated for only a few of them. It is conceivable that physiologically important non-AUG uORFs are evolutionarily conserved across species. In this study, using a combination of bioinformatics and experimental approaches, we searched the Arabidopsis genome for non-AUG-initiated uORFs with conserved sequences that control the expression of the mORF-encoded proteins. As a result, we identified four novel regulatory non-AUG uORFs. Among these, two exerted repressive effects on mORF expression in an amino acid sequence-dependent manner. These two non-AUG uORFs are likely to encode regulatory peptides that cause ribosome stalling, thereby enhancing their repressive effects. In contrast, one of the identified regulatory non-AUG uORFs promoted mORF expression by alleviating the inhibitory effect of a downstream AUG-initiated uORF. These findings provide insights into the mechanisms that enable non-AUG uORFs to play regulatory roles despite their low translation initiation efficiencies.
