Wild populations of the house mouse (Mus musculus) represent the raw genetic material for the classical inbred strains in biomedical research and are a major model system for evolutionary biology. We provide whole genome sequencing data of individuals representing natural populations of M. m. domesticus (24 individuals from 3 populations), M. m. helgolandicus (3 individuals), M. m. musculus (22 individuals from 3 populations) and M. spretus (8 individuals from one population). We use a single pipeline to map and call variants for these individuals and also include 10 additional individuals of M. m. castaneus for which genomic data are publicly available. In addition, RNAseq data were obtained from 10 tissues of up to eight adult individuals from each of the three M. m. domesticus populations for which genomic data were collected. Data and analyses are presented via tracks viewable in the UCSC or IGV genome browsers. We also provide information on available outbred stocks and instructions on how to keep them in the laboratory.
Background: New gene emergence is so far assumed to be mostly driven by duplication and divergence of existing genes. The possibility that entirely new genes could emerge out of the non-coding genomic background was long thought to be almost negligible. With the increasing availability of fully sequenced genomes across broad scales of phylogeny, it has become possible to systematically study the origin of new genes over time and thus revisit this question. Results: We have used phylostratigraphy to assess trends of gene evolution across successive phylogenetic phases, using mostly the well-annotated mouse genome as a reference. We find several significant general trends and confirm them for three other vertebrate genomes (humans, zebrafish and stickleback). Younger genes are shorter, with respect to both gene length and open reading frame length. They also contain fewer exons and have fewer recognizable domains. Average exon length, on the other hand, does not change much over time. Only the most recently evolved genes have longer exons, and they are often associated with active promoter regions, i.e. are part of bidirectional promoters. We have also revisited the possibility that de novo evolution of genes could occur even within existing genes, by making use of an alternative reading frame (overprinting). We find several cases among the annotated Ensembl ORFs where the new reading frame has emerged at a higher phylostratigraphic level than the original one. We discuss some of these overprinted genes, which also include the Hoxa9 gene, where an alternative reading frame covering the homeobox has emerged within the lineage leading to rodents and primates (Euarchontoglires). Conclusions: We suggest that the overall trends of gene emergence are more compatible with a de novo evolution model for orphan genes than a general duplication-divergence model. Hence de novo evolution of genes appears to have occurred continuously throughout evolutionary time and should therefore be considered as a general mechanism for the emergence of new gene functions.
The phenomenon of de novo gene birth from junk DNA is surprising, because random polypeptides are expected to be toxic. There are two conflicting views about how de novo gene birth is nevertheless possible: the continuum hypothesis invokes a gradual gene birth process, while the preadaptation hypothesis predicts that young genes will show extreme levels of gene-like traits. We show that intrinsic structural disorder conforms to the predictions of the preadaptation hypothesis and falsifies the continuum hypothesis, with all genes having higher levels than translated junk DNA, but young genes having the highest level of all. Results are robust to homology detection bias, to the non-independence of multiple members of the same gene family, and to the false positive annotation of protein-coding genes.
Orphan genes are genes that occur in specific evolutionary lineages without similarity to genes outside of these lineages and have, therefore, alternatively been named taxonomically restricted genes. They were so far considered to emerge through duplication–divergence processes, but it is now becoming clear that they can also arise de novo out of noncoding deoxyribonucleic acid (DNA). This latter process may even occur much more frequently than previously assumed. It appears that genomes harbour many transcripts in a transition stage from nonfunctional to functional genes, also known as protogenes, which are exposed to evolutionary testing and can become fixed when they turn out to be useful. Orphan genes may have played key roles in generating lineage‐specific adaptations and could be a continuous source of evolutionary novelties. Their existence suggests that functional ribonucleic acids (RNAs) and proteins can relatively easily arise out of random nucleotide sequences, although these processes still need to be experimentally explored. Key Concepts: Orphan genes, or taxonomically restricted genes, have arisen at all levels of the phylogenetic hierarchy. All genes that cannot be traced to the first cellular ancestor are orphan genes in some lineages. New genes may not only arise through gene duplication, but also through de novo evolution. Spurious transcripts can give rise to protogenes, from which new functional genes evolve. Emergence of new genes from protogenes is an active process in all extant genomes. New genes may first act as noncoding RNAs before obtaining a functional reading frame. Overprinting of existing reading frames with new reading frames is another possibility of de novo evolution of gene functions. Orphan genes may contribute to lineage‐specific adaptations. Orphan genes may carry information on the evolutionary past that can be harnessed by the phylostratigraphic approach. There is a continuous birth–death dynamics of gene evolution.
The transition to multicellularity has occurred numerous times in all domains of life, yet its initial steps are poorly understood. The volvocine green algae are a tractable system for understanding the genetic basis of multicellularity including the initial formation of cooperative cell groups. Here we report the genome sequence of the undifferentiated colonial alga, Gonium pectorale, where group formation evolved by co-option of the retinoblastoma cell cycle regulatory pathway. Significantly, expression of the Gonium retinoblastoma cell cycle regulator in unicellular Chlamydomonas causes it to become colonial. The presence of these changes in undifferentiated Gonium indicates extensive group-level adaptation during the initial step in the evolution of multicellularity. These results emphasize an early and formative step in the evolution of multicellularity, the evolution of cell cycle regulation, one that may shed light on the evolutionary history of other multicellular innovations and evolutionary transitions.
It is generally assumed that new genes arise through duplication and/or recombination of existing genes. The probability that a new functional gene could arise out of random non-coding DNA is so far considered to be negligible, since it seems unlikely that such an RNA or protein sequence could have an initial function that influences the fitness of an organism. We have here tested this question systematically, by expressing clones with random sequences in E. coli and subjecting them to competitive growth. Contrary to expectations, we find that random sequences with bioactivity are not rare. In our experiments we find that up to 25% of the evaluated clones enhance the growth rate of their cells and up to 52% inhibit growth. Testing of individual clones in competition assays confirms their activity and provides an indication that their activity could be exerted either by the transcribed RNA or the translated peptide. This suggests that transcribed and translated random parts of the genome could indeed have a high potential to become functional. The results also suggest that random sequences may become an effective new source of molecules for studying cellular functions, as well as for pharmacological activity screening.
In life, genetic and epigenetic networks precisely coordinate the expression of genes—but in death, it is not known if gene expression diminishes gradually or abruptly stops or if specific genes and pathways are involved. We studied this by identifying mRNA transcripts that apparently increase in relative abundance after death, assessing their functions, and comparing their abundance profiles through postmortem time in two species, mouse and zebrafish. We found mRNA transcript profiles of 1063 genes became significantly more abundant after death of healthy adult animals in a time series spanning up to 96 h postmortem. Ordination plots revealed non-random patterns in the profiles by time. While most of these transcript levels increased within 0.5 h postmortem, some increased only at 24 and 48 h postmortem. Functional characterization of the most abundant transcripts revealed the following categories: stress, immunity, inflammation, apoptosis, transport, development, epigenetic regulation and cancer. The data suggest a step-wise shutdown occurs in organismal death that is manifested by the apparent increase of certain transcripts with various abundance maxima and durations.
Deep sequencing analyses have shown that a large fraction of genomes is transcribed, but the significance of this transcription is much debated. Here, we characterize the phylogenetic turnover of poly-adenylated transcripts in a comprehensive sampling of taxa of the mouse (genus Mus), spanning a phylogenetic distance of 10 Myr. Using deep RNA sequencing we find that at a given sequencing depth transcriptome coverage becomes saturated within a taxon, but keeps extending when compared between taxa, even at this very shallow phylogenetic level. Our data show a high turnover of transcriptional states between taxa and that no major transcript-free islands exist across evolutionary time. This suggests that the entire genome can be transcribed into poly-adenylated RNA when viewed at an evolutionary time scale. We conclude that any part of the non-coding genome can potentially become subject to evolutionary functionalization via de novo gene evolution within relatively short evolutionary time spans. DOI: http://dx.doi.org/10.7554/eLife.09977.001