The duplication of entire genomes has long been recognized as having great potential for evolutionary novelties, but the mechanisms underlying their resolution through gene loss are poorly understood. Here we show that in the unicellular eukaryote Paramecium tetraurelia, a ciliate, most of the nearly 40,000 genes arose through at least three successive whole-genome duplications. Phylogenetic analysis indicates that the most recent duplication coincides with an explosion of speciation events that gave rise to the P. aurelia complex of 15 sibling species. We observed that gene loss occurs over a long timescale, not as an initial massive event. Genes from the same metabolic pathway or protein complex have common patterns of gene loss, and highly expressed genes are over-retained after all duplications. The conclusion of this analysis is that many genes are maintained after whole-genome duplication not because of functional innovation but because of gene dosage constraints.
Ciliates are unique among unicellular organisms in that they separate germline and somatic functions [1]. Each cell harbours two kinds of nucleus, namely silent diploid micronuclei and highly polyploid macronuclei. The latter are unusual in that they contain an extensively rearranged genome streamlined for expression and divide by a non-mitotic process. Only micronuclei undergo meiosis to perpetuate genetic information; the macronuclei are lost at each sexual generation and develop anew from the micronuclear lineage.
In Paramecium the exact number of micronuclear chromosomes (more than 50) and the structures of their centromeres and telomeres remain unknown. During macronuclear development, these chromosomes are amplified to about 800 copies and undergo two types of DNA elimination event.
Tens of thousands of short, unique-copy elements (internal eliminated sequences) are removed by a precise mechanism that leads to the reconstitution of functional genes [2]. Transposable elements and other repeated sequences are removed by an imprecise mechanism leading either to chromosome fragmentation and de novo telomere addition or to variable internal deletions [3]. These rearrangements occur after a few rounds of endoreplication, leading to some heterogeneity in the sequences abutting the imprecisely eliminated regions [3]. The sizes of the resulting acentric macronuclear chromosomes range from 50 to 1,000 kilobases (kb), as measured by pulsed-field gel electrophoresis. Because the sexual process of autogamy results in an entirely homozygous genotype [4], the macronuclear DNA that was sequenced was genetically homogeneous.
The Paramecium genome sequence
The Paramecium macronuclear genome sequence was established with the use of a whole-genome shotgun and assembly strategy. Paired-end sequencing of plasmid and bacterial artificial chromosome (BAC) clones provided a coverage of 13 genome equivalents (Supplementary Table S1). We assembled the sequence reads with Arachne [5] in 1,907 contigs connected in 697 scaffolds of size greater than 2 kb, giving a total coverage of 72...
The BioMart Community Portal (www.biomart.org) is a community-driven effort to provide a unified interface to biomedical databases that are distributed worldwide. The portal provides access to numerous database projects supported by 30 scientific organizations. It includes over 800 different biological datasets spanning genomics, proteomics, model organisms, cancer data, ontology information and more. All resources available through the portal are independently administered and funded by their host organizations. The BioMart data federation technology provides a unified interface to all the available data. The latest version of the portal comes with many new databases that have been created by our ever-growing community. It also comes with better support and extensibility for data analysis and visualization tools. A new addition to our toolbox, the enrichment analysis tool, is now accessible through graphical and web service interfaces. The BioMart Community Portal averages over one million requests per day. Building on this level of service and the wealth of information that has become available, the BioMart Community Portal has introduced a new, more scalable and cheaper alternative to the large data stores maintained by specialized organizations.
Insertions of parasitic DNA within coding sequences are usually deleterious and are generally counter-selected during evolution. Thanks to nuclear dimorphism, ciliates provide unique models to study the fate of such insertions. Their germline genome undergoes extensive rearrangements during development of a new somatic macronucleus from the germline micronucleus following sexual events. In Paramecium, these rearrangements include precise excision of unique-copy Internal Eliminated Sequences (IES) from the somatic DNA, requiring the activity of a domesticated piggyBac transposase, PiggyMac. We have sequenced Paramecium tetraurelia germline DNA, establishing a genome-wide catalogue of ∼45,000 IESs, in order to gain insight into their evolutionary origin and excision mechanism. We obtained direct evidence that PiggyMac is required for excision of all IESs. Homology with known P. tetraurelia Tc1/mariner transposons, described here, indicates that at least a fraction of IESs derive from these elements. Most IES insertions occurred before a recent whole-genome duplication that preceded diversification of the P. aurelia species complex, but IES invasion of the Paramecium genome appears to be an ongoing process. Once inserted, IESs decay rapidly by accumulation of deletions and point substitutions. Over 90% of the IESs are shorter than 150 bp and present a remarkable size distribution with a ∼10 bp periodicity, corresponding to the helical repeat of double-stranded DNA and suggesting DNA loop formation during assembly of a transpososome-like excision complex. IESs are equally frequent within and between coding sequences; however, excision is not 100% efficient and there is selective pressure against IES insertions, in particular within highly expressed genes. 
We discuss the possibility that ancient domestication of a piggyBac transposase favored subsequent propagation of transposons throughout the germline by allowing insertions in coding sequences, a fraction of the genome in which parasitic DNA is not usually tolerated.
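The ∼10 bp periodicity reported for IES sizes is the kind of signal that can be checked with a simple autocorrelation of the length histogram. The following is a minimal sketch of that idea, not the authors' analysis: the function name `length_autocorrelation` and its `max_lag` parameter are hypothetical, and the input is assumed to be a plain list of IES lengths in base pairs.

```python
from collections import Counter


def length_autocorrelation(lengths, max_lag=20):
    """Return {lag: autocorrelation} for the histogram of sequence lengths.

    A periodic size distribution shows up as a high value at the lag
    matching the period (for example, a lag near 10 for a ~10 bp period).
    """
    counts = Counter(lengths)
    lo, hi = min(counts), max(counts)
    # One histogram bin per integer length between the shortest and longest.
    hist = [counts.get(n, 0) for n in range(lo, hi + 1)]
    mean = sum(hist) / len(hist)
    centered = [h - mean for h in hist]
    variance = sum(c * c for c in centered)

    result = {}
    for lag in range(1, max_lag + 1):
        if lag >= len(centered):
            break
        cov = sum(centered[i] * centered[i + lag]
                  for i in range(len(centered) - lag))
        # Normalize by the lag-0 term so values fall in [-1, 1].
        result[lag] = cov / variance if variance else 0.0
    return result
```

On real data the histogram would usually be restricted to the short IESs (the abstract notes that over 90% are shorter than 150 bp) before looking for the lag with the highest autocorrelation.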
In the ciliate Paramecium tetraurelia, differentiation of the somatic nucleus from the zygotic nucleus is characterized by massive and reproducible deletion of transposable elements and of 45,000 short, dispersed, single-copy sequences. A specific class of small RNAs produced by the germline during meiosis, the scnRNAs, are involved in the epigenetic regulation of DNA deletion but the underlying mechanisms are poorly understood. Here, we show that trimethylation of histone H3 (H3K27me3 and H3K9me3) displays a dynamic nuclear localization that is altered when the endonuclease required for DNA elimination is depleted. We identified the putative histone methyltransferase Ezl1 necessary for H3K27me3 and H3K9me3 establishment and show that it is required for correct genome rearrangements. Genome-wide analyses show that scnRNA-mediated H3 trimethylation is necessary for the elimination of long, repeated germline DNA, while single-copy sequences display differential sensitivity to depletion of proteins involved in the scnRNA pathway: Ezl1, a putative histone methyltransferase, and Dcl5, a protein required for iesRNA biogenesis. Our study reveals that cis-acting determinants, such as DNA length, also contribute to the definition of germline sequences to delete. We further show that precise excision of single-copy DNA elements, as short as 26 bp, requires Ezl1, suggesting that development-specific H3K27me3 and H3K9me3 ensure specific demarcation of very short germline sequences from the adjacent somatic sequences.
Most eukaryotic genes are interrupted by non-coding introns that must be accurately removed from pre-messenger RNAs to produce translatable mRNAs. Splicing is guided locally by short conserved sequences, but genes typically contain many potential splice sites, and the mechanisms specifying the correct sites remain poorly understood. In most organisms, short introns recognized by the intron definition mechanism cannot be efficiently predicted solely on the basis of sequence motifs. In multicellular eukaryotes, long introns are recognized through exon definition and most genes produce multiple mRNA variants through alternative splicing. The nonsense-mediated mRNA decay (NMD) pathway may further shape the observed sets of variants by selectively degrading those containing premature termination codons, which are frequently produced in mammals. Here we show that the tiny introns of the ciliate Paramecium tetraurelia are under strong selective pressure to cause premature termination of mRNA translation in the event of intron retention, and that the same bias is observed among the short introns of plants, fungi and animals. By knocking down the two P. tetraurelia genes encoding UPF1, a protein that is crucial in NMD, we show that the intrinsic efficiency of splicing varies widely among introns and that NMD activity can significantly reduce the fraction of unspliced mRNAs. The results suggest that, independently of alternative splicing, species with large intron numbers universally rely on NMD to compensate for suboptimal splicing efficiency and accuracy.
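To make the idea of "premature termination in the event of intron retention" concrete, the sketch below classifies what happens to the reading frame when an intron stays in the mRNA. It is an illustration under stated assumptions, not the authors' pipeline: the function name `retention_effect` and its labels are hypothetical, the upstream coding sequence is assumed to start at the start codon, and the default stop set reflects the ciliate nuclear genetic code used by Paramecium, in which TGA is the only stop codon (TAA and TAG encode glutamine).

```python
# Ciliate nuclear code: TGA is the only stop codon in Paramecium.
CILIATE_STOPS = frozenset({"TGA"})
# Standard code, for comparison with plants, fungi and animals.
STANDARD_STOPS = frozenset({"TAA", "TAG", "TGA"})


def retention_effect(upstream_cds, intron, stop_codons=CILIATE_STOPS):
    """Classify the effect of retaining `intron` on translation.

    upstream_cds: coding sequence from the start codon to the intron's 5' end.
    Returns "in-frame stop", "frameshift" or "silent retention".
    """
    phase = len(upstream_cds) % 3
    # Bases of the incomplete codon carried over from the upstream exon.
    carried = upstream_cds[len(upstream_cds) - phase:]
    seq = (carried + intron).upper()

    # Scan complete codons in the frame inherited from the upstream exon.
    for i in range(0, len(seq) - 2, 3):
        if seq[i:i + 3] in stop_codons:
            return "in-frame stop"
    if len(intron) % 3 != 0:
        # No stop inside the intron, but the downstream frame is shifted,
        # which will typically expose a stop codon further along.
        return "frameshift"
    return "silent retention"
```

Only the third outcome leaves the downstream reading frame intact; an intron whose length is a multiple of three and that carries no in-frame stop would simply add residues to the protein if it were retained.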
In animals and plants, the H3K9me3 and H3K27me3 chromatin silencing marks are deposited by different protein machineries. H3K9me3 is catalyzed by the SET-domain SU(VAR)3–9 enzymes, while H3K27me3 is catalyzed by the SET-domain Enhancer-of-zeste enzymes, which are the catalytic subunits of Polycomb Repressive Complex 2 (PRC2). Here, we show that the Enhancer-of-zeste-like protein Ezl1 from the unicellular eukaryote Paramecium tetraurelia, which exhibits significant sequence and structural similarities with human EZH2, catalyzes methylation of histone H3 in vitro and in vivo with an apparent specificity toward K9 and K27. We find that H3K9me3 and H3K27me3 co-occur at multiple families of transposable elements in an Ezl1-dependent manner. We demonstrate that loss of these histone marks results in global transcriptional hyperactivation of transposable elements with modest effects on protein-coding gene expression. Our study suggests that although often considered functionally distinct, H3K9me3 and H3K27me3 may share a common evolutionary history as well as a common ancestral role in silencing transposable elements.
BioMart Central Portal is a first-of-its-kind, community-driven effort to provide unified access to dozens of biological databases spanning genomics, proteomics, model organisms, cancer data, ontology information and more. Anybody can contribute an independently maintained resource to the Central Portal, allowing it to be exposed to and shared with the research community, and linking it with the other resources in the portal. Users can take advantage of the common interface to quickly utilize different sources without learning a new system for each. The system also simplifies cross-database searches that might otherwise require several complicated steps. Several integrated tools streamline common tasks, such as converting between ID formats and retrieving sequences. The combination of a wide variety of databases, an easy-to-use interface, robust programmatic access and the array of tools makes Central Portal a one-stop shop for biological data querying. Here, we describe the structure of Central Portal and show example queries to demonstrate its capabilities. Database URL: http://central.biomart.org.
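As a rough illustration of what programmatic access can look like, the sketch below builds a query document in the XML layout that BioMart web services have conventionally accepted (a `Query` element wrapping a `Dataset` with `Filter` and `Attribute` children). This is an assumption about the interface, not something taken from the abstract: the helper name `build_biomart_query` is hypothetical, the dataset, filter and attribute names passed to it are placeholders to be replaced with names from the mart being queried, and no request is sent.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(dataset, attributes, filters=None):
    """Return a BioMart-style XML query string (nothing is sent anywhere).

    dataset:    name of the dataset to query (placeholder, mart-specific)
    attributes: list of column names to retrieve
    filters:    optional {filter_name: value} restrictions
    """
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",        # tab-separated output
        "header": "1",             # include a header row
        "uniqueRows": "1",         # drop duplicate rows
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset",
                       {"name": dataset, "interface": "default"})
    for name, value in (filters or {}).items():
        ET.SubElement(ds, "Filter", {"name": name, "value": str(value)})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")
```

The resulting string would typically be submitted as the query parameter of a mart service request; the exact endpoint and the available dataset and attribute names depend on the portal instance and should be taken from its own documentation.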