Gene duplications and their subsequent divergence play an important part in the evolution of novel gene functions. Several models for the emergence, maintenance and evolution of gene copies have been proposed. However, a clear consensus on how gene duplications are fixed and maintained in genomes is lacking. Here, we present a comprehensive classification of the models that are relevant to all stages of the evolution of gene duplications. Each model predicts a unique combination of evolutionary dynamics and functional properties. Setting out these predictions is an important step towards identifying the main mechanisms that are involved in the evolution of gene duplications.
The origins of neural systems remain unresolved. In contrast to other basal metazoans, ctenophores, or comb jellies, have both complex nervous and mesoderm-derived muscular systems. These holoplanktonic predators also have sophisticated ciliated locomotion, behaviour and distinct development. Here, we present the draft genome of Pleurobrachia bachei, Pacific sea gooseberry, together with ten other ctenophore transcriptomes and show that they are remarkably distinct from other animal genomes in their content of neurogenic, immune and developmental genes. Our integrative analyses place Ctenophora as the earliest lineage within Metazoa. This hypothesis is supported by comparative analysis of multiple gene families, including the apparent absence of HOX genes, canonical microRNA machinery, and reduced immune complement in ctenophores. Although two distinct nervous systems are well-recognized in ctenophores, many bilaterian neuron-specific genes and genes of “classical” neurotransmitter pathways either are absent or, if present, are not expressed in neurons. Our metabolomic and physiological data are consistent with the hypothesis that ctenophore neural systems, and possibly muscle specification, evolved independently from those in other animals.
Fitness landscapes [1,2], depictions of how genotypes manifest at the phenotypic level, form the basis for our understanding of many areas of biology [2–7] yet their properties remain elusive. Studies addressing this issue often consider specific genes and their function as proxy for fitness [2,4], experimentally assessing the impact on function of single mutations and their combinations in a specific sequence [2,5,8–15] or in different sequences [2,3,5,16–18]. However, systematic high-throughput studies of the local fitness landscape of an entire protein have not yet been reported. Here, we chart an extensive region of the local fitness landscape of the green fluorescent protein from Aequorea victoria (avGFP) by measuring the native function, fluorescence, of tens of thousands of derivative genotypes of avGFP. We find that its fitness landscape is narrow, with half of genotypes with two mutations showing reduced fluorescence and half of genotypes with five mutations being completely non-fluorescent. The narrowness is enhanced by epistasis, which was detected in up to 30% of genotypes with multiple mutations arising mostly through the cumulative impact of slightly deleterious mutations causing a threshold-like decrease of protein stability and concomitant loss of fluorescence. A model of orthologous sequence divergence spanning hundreds of millions of years predicted the extent of epistasis in our data, indicating congruence between the fitness landscape properties at the local and global scales. The characterization of the local fitness landscape of avGFP has important implications for a number of fields including molecular evolution, population genetics and protein design.
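The threshold-like epistasis described above can be illustrated with a toy calculation. This is a minimal sketch and not the authors' model: the sigmoid form, the `threshold` and `steepness` values, the per-mutation destabilization values, and the choice to measure epistasis as a deviation from additivity on the raw fluorescence scale are all assumptions made for illustration.

```python
import math


def fluorescence(total_ddg, threshold=3.0, steepness=4.0):
    """Hypothetical sigmoid mapping summed destabilization to relative fluorescence (0..1).

    `threshold` and `steepness` are illustrative assumptions, not fitted values.
    """
    return 1.0 / (1.0 + math.exp(steepness * (total_ddg - threshold)))


def epistasis(ddg_a, ddg_b):
    """Observed double-mutant fluorescence minus the additive expectation from the singles."""
    wt = fluorescence(0.0)
    f_a = fluorescence(ddg_a)
    f_b = fluorescence(ddg_b)
    f_ab = fluorescence(ddg_a + ddg_b)
    # Additive expectation: wild type plus the two single-mutant effects.
    expected = wt + (f_a - wt) + (f_b - wt)
    return f_ab - expected


# Two mutations that are each only slightly deleterious on their own ...
strong = epistasis(2.0, 2.0)
# ... versus two mutations far from the stability threshold.
weak = epistasis(0.1, 0.1)
print(f"near threshold: {strong:.3f}, far from threshold: {weak:.6f}")
```

In this sketch, each single mutation barely changes fluorescence, but the pair crosses the assumed stability threshold together, so the double mutant falls far below the additive expectation (strong negative epistasis). The same pair of small effects far from the threshold combines almost additively.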
A subject of extensive study in evolutionary theory has been the issue of how neutral, redundant copies can be maintained in the genome for long periods of time. Concurrently, examples of adaptive gene duplications to various environmental conditions in different species have been described. At this point, it is too early to tell whether or not a substantial fraction of gene copies have initially achieved fixation by positive selection for increased dosage. Nevertheless, enough examples have accumulated in the literature that such a possibility should be considered. Here, I review the recent examples of adaptive gene duplications and make an attempt to draw generalizations on what types of genes may be particularly prone to be selected for under certain environmental conditions. The identification of copy-number variation in ecological field studies of species adapting to stressful or novel environmental conditions may improve our understanding of gene duplications as a mechanism of adaptation and its relevance to the long-term persistence of gene duplications.
Sympatric speciation, the origin of two or more species from a single local population, has almost certainly been involved in formation of several species flocks, and may be fairly common in nature. The most straightforward scenario for sympatric speciation requires disruptive selection favouring two substantially different phenotypes, and consists of the evolution of reproductive isolation between them followed by the elimination of all intermediate phenotypes. Here we use the hypergeometric phenotypic model to show that sympatric speciation is possible even when fitness and mate choice depend on different quantitative traits, so that speciation must involve formation of covariance between these traits. The increase in the number of variable loci affecting fitness facilitates sympatric speciation, whereas the increase in the number of variable loci affecting mate choice has the opposite effect. These predictions may enable more cases of sympatric speciation to be identified.
Transcription is a slow and expensive process: in eukaryotes, approximately 20 nucleotides can be transcribed per second at the expense of at least two ATP molecules per nucleotide. Thus, at least for highly expressed genes, transcription of long introns, which are particularly common in mammals, is costly. Using data on the expression of genes that encode proteins in Caenorhabditis elegans and Homo sapiens, we show that introns in highly expressed genes are substantially shorter than those in genes that are expressed at low levels. This difference is greater in humans, such that introns are, on average, 14 times shorter in highly expressed genes than in genes with low expression, whereas in C. elegans the difference in intron length is only twofold. In contrast, the density of introns in a gene does not strongly depend on the level of gene expression. Thus, natural selection appears to favor short introns in highly expressed genes to minimize the cost of transcription and other molecular processes, such as splicing.
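As a rough worked example of the cost argument above, the sketch below uses only the two figures quoted in the abstract (approximately 20 nucleotides per second and at least two ATP molecules per nucleotide). The intron lengths and the function name `transcription_cost` are hypothetical, chosen only to mirror the 14-fold ratio reported for humans; they are not taken from the paper's data.

```python
# Figures quoted in the abstract.
NT_PER_SECOND = 20   # approximate eukaryotic transcription rate
ATP_PER_NT = 2       # lower bound on ATP spent per nucleotide


def transcription_cost(length_nt):
    """Return (seconds, minimum ATP) needed to transcribe `length_nt` nucleotides once."""
    return length_nt / NT_PER_SECOND, length_nt * ATP_PER_NT


# Hypothetical intron lengths, for illustration only.
short_intron = 1_000
long_intron = 14 * short_intron  # 14-fold longer, mirroring the human ratio

short_time, short_atp = transcription_cost(short_intron)
long_time, long_atp = transcription_cost(long_intron)

print(f"short intron: {short_time:.0f} s, at least {short_atp} ATP per transcript")
print(f"long intron:  {long_time:.0f} s, at least {long_atp} ATP per transcript")
```

Because both time and ATP scale linearly with length, the per-transcript cost difference is multiplied by the number of transcripts made, which is why the argument applies mainly to highly expressed genes.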
We study fitness landscape in the space of protein sequences by relating sets of human pathogenic missense mutations in 32 proteins to amino acid substitutions that occurred in the course of evolution of these proteins. On average, ≈10% of deviations of a nonhuman protein from its human ortholog are compensated pathogenic deviations (CPDs), i.e., are caused by an amino acid substitution that, at this site, would be pathogenic to humans. Normal functioning of a CPD-containing protein must be caused by other, compensatory deviations of the nonhuman species from humans. Together, a CPD and the corresponding compensatory deviation form a Dobzhansky-Muller incompatibility that can be visualized as the corner on a fitness ridge. Thus, proteins evolve along fitness ridges which contain only ≈10 steps between successive corners. The fraction of CPDs among all deviations of a protein from its human ortholog does not increase with the evolutionary distance between the proteins, indicating that substitutions that carry evolving proteins around these corners occur in rapid succession, driven by positive selection. Data on fitness of interspecies hybrids suggest that the compensatory change that makes a CPD fit usually occurs within the same protein. Data on protein structures and on cooccurrence of amino acids at different sites of multiple orthologous proteins often make it possible to provisionally identify the substitution that compensates a particular CPD.
Evolution unfolds on a fitness landscape, a map that relates fitness to the genotype (1). Obviously, most of possible genotypes are always unfit, and some of rare fit genotypes must be arranged in continuous ridges (networks). This general paradigm can be applied, inter alia, to the evolution of proteins. "If evolution . . . is to occur, functional proteins must form a continuous network which can be traversed by unit mutational steps without passing through non-functional intermediates" (2).
However, data on fitness landscapes are limited, because only a tiny fraction of all possible genotypes is actually available, and inferring fitness of currently nonexisting genotypes is difficult (3-6). Relating data on human pathogenic missense mutations, which represent unfit genotypes, to interspecies differences between homologous proteins, all of which must be fit, offers a novel opportunity to probe the fitness landscape in the space of proteins.
Human pathogenic amino acid substitutions tend to occur at less variable sites of proteins (7). Thus, an amino acid that in a nonhuman protein is different from the amino acid at the homologous site of the human ortholog would probably be benign for humans if placed into this site. Still, exceptions to this rule have been described (8, 9). For example, the 53rd site of human α-synuclein is normally occupied by Ala, and Ala → Thr substitution at this site predisposes to Parkinson's disease. Nevertheless, healthy mice (and rats) carry Thr at the homologous site of their α-synucleins (8). We call such a situation a compensated pathog...
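The idea of a compensated pathogenic deviation can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' pipeline: the function name `find_cpds`, the pre-aligned toy sequences, and the toy set of pathogenic substitutions are all hypothetical and do not correspond to any real protein.

```python
def find_cpds(human_seq, ortholog_seq, pathogenic):
    """Split ortholog deviations into all deviations and those matching a pathogenic substitution.

    Assumes the two sequences are already aligned to equal length, with '-' for gaps.
    `pathogenic` is a set of (1-based position, human residue, mutant residue) tuples.
    """
    deviations = []
    cpds = []
    for pos, (h, o) in enumerate(zip(human_seq, ortholog_seq), start=1):
        if h == "-" or o == "-" or h == o:
            continue  # skip gaps and identical sites
        deviations.append((pos, h, o))
        if (pos, h, o) in pathogenic:
            cpds.append((pos, h, o))
    return deviations, cpds


# Toy data, invented for illustration only.
human = "MKAVT"
ortholog = "MRATT"
toy_pathogenic = {(4, "V", "T"), (1, "M", "L")}

deviations, cpds = find_cpds(human, ortholog, toy_pathogenic)
print("deviations:", deviations)
print("CPDs:", cpds)
print("CPD fraction:", len(cpds) / len(deviations))
```

Here the ortholog differs from the human sequence at two sites, and one of those differences coincides with a substitution listed as pathogenic in humans, so it is counted as a CPD. With real data, the CPD fraction would be computed per protein over many orthologs, as in the ≈10% average reported in the abstract.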