2021
DOI: 10.1093/dnares/dsab007

Understanding small ORF diversity through a comprehensive transcription feature classification

Abstract: Small open reading frames (small ORFs/sORFs/smORFs) are potentially coding sequences smaller than 100 codons that have historically been considered junk DNA by gene prediction software and in annotation screening; however, the advent of next-generation sequencing has contributed to the deeper investigation of junk DNA regions and their transcription products, resulting in the emergence of smORFs as a new focus of interest in systems biology. Several smORF peptides were recently reported in noncanonical mRNAs a…

Cited by 24 publications (14 citation statements)
References 241 publications (307 reference statements)
“…We predicted protein-coding genes in each assembled genome based on homology to known genes and pseudogenes in GRCg6a and GRCg7b/w assemblies (Table S2) as well as RNA-seq data collected in various tissues in the four indigenous chickens (Materials and Methods). We predicted a total of 17,497~17,718 protein-coding genes in each of the four assembled indigenous genomes (Table 4), similar to that annotated in GRCg6a (17,485), but fewer than those annotated in GRCg7b (18,024) and GRCg7w (18,016). Specifically, we predicted 16,917~17,141 genes in each genome based on homology to known genes, of which 16,270~16,668 have an intact ORF (intact genes) (Table 4), and 473~647 contain either a nonsense or an ORF shift mutation that cannot be fully supported by short DNA reads from the chicken.…”
Section: New Protein-coding Genes Are Found In Indigenous Chicken Gen... (mentioning)
confidence: 97%
“…Moreover, 756 (56.6%) of the 1,335 new genes have a CDS length of 100~300bp, while all of our 1,420 new genes have a CDS length longer than 300bp (Figure 4f). Since a CDS with a total length shorter than 300bp is generally considered as a mini-ORF [18], we do not consider mini-ORF as genes. A total of 660 (49.4%) of the 1,335 new genes can be mapped to at least one of our indigenous chicken genomes with an identity greater than 98.5%, thus we have assembled their loci in at least one of the four genomes.…”
Section: New Protein-coding Genes Are Found In Indigenous Chicken Gen... (mentioning)
confidence: 99%
“…In fact, noncoding regions can be seen as a reservoir of unselected sequences, hosting thousands of small Open Reading Frames (ORFs) that could give rise to novel products if translated [1][2][3][4][5][6]. Precisely, OMICS technologies have provided a huge amount of data revealing the "omnipresence" of biological noise which has turned out to result from the pervasivity of biological processes.…”
Section: Introduction (mentioning)
confidence: 99%
“…[22] Identification of translated smORFs remains technically challenging and therefore the mechanisms and modes of action of these micro-peptides are poorly understood. [23–25]…”
Section: Introduction (mentioning)
confidence: 99%