Transposable elements (TEs) are one of the major driving forces of genome evolution, raising the question of the long-term dynamics underlying their evolutionary success. Long-term TE evolution can readily be reconstructed in eukaryotes, thanks to many degraded copies constituting genomic fossil records of past TE proliferations. By contrast, bacterial genomes usually experience high sequence turnover and short TE retention times, thereby obscuring ancient TE evolutionary patterns. We found that Wolbachia bacterial genomes contain 52–171 insertion sequence (IS) TEs. IS account for 11% of the Wolbachia wRi genome, which is one of the highest IS genomic coverages reported in prokaryotes to date. We show that many IS groups are currently expanding in various Wolbachia genomes and that IS horizontal transfers are frequent among strains, which can explain the apparent synchronicity of these IS proliferations. Remarkably, >70% of Wolbachia IS are nonfunctional. They constitute an unusual bacterial IS genomic fossil record providing direct empirical evidence for long-term IS evolutionary dynamics following successive periods of intense transpositional activity. Our results show that comprehensive IS annotations have the potential to provide new insights into prokaryote TE evolution and, more generally, prokaryote genome evolution. Indeed, the identification of an important IS genomic fossil record in Wolbachia demonstrates that IS elements are not always of recent origin, contrary to the conventional view of TE evolution in prokaryote genomes. Our results also raise the question of whether the abundance of IS fossils is specific to Wolbachia or is a general, albeit overlooked, feature of prokaryote genomes.
Background: Next-generation sequencing (NGS) technologies are arguably the most revolutionary technical development to join the list of tools available to molecular biologists since PCR. For researchers working with nonconventional model organisms, one major problem with the currently dominant NGS platform (Illumina) stems from the obligatory fragmentation of nucleic acid material that occurs prior to sequencing during library preparation. This step creates a significant bioinformatic challenge for accurate de novo assembly of novel transcriptome data. This challenge becomes apparent when a variety of modern assembly tools (of which there is no shortage) are applied to the same raw NGS dataset. With the same assembly parameters these tools can generate markedly different assembly outputs. Results: In this study we present an approach that generates an optimized consensus de novo assembly of eukaryotic coding transcriptomes. This approach does not represent a new assembler; rather, it combines the outputs of a variety of established assembly packages and removes redundancy via a series of clustering steps. We test and validate our approach using Illumina datasets from six phylogenetically diverse eukaryotes (three metazoans, two plants and a yeast) and two simulated datasets derived from metazoan reference genome annotations. All of these datasets were assembled using three currently popular assembly packages (CLC, Trinity and IDBA-tran). In addition, we experimentally demonstrate that transcripts unique to one particular assembly package are likely to be bioinformatic artefacts. For all eight datasets our pipeline generates more concise transcriptomes that nevertheless possess more unique annotatable protein domains than those of any of the three individual assemblers we employed. Another measure of assembly completeness (using the purpose-built BUSCO databases) also confirmed that our approach yields more information. Conclusions: Our approach yields coding transcriptome assemblies that are more likely to be closer to biological reality than those of any of the three individual assembly packages we investigated. This approach (freely available as a simple Perl script) will be of use to researchers working with species for which there is little or no reference data against which the assembly of a transcriptome can be performed. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1406-x) contains supplementary material, which is available to authorized users.
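The consensus idea in the abstract above (pool the outputs of several assemblers, then collapse redundancy) can be illustrated with a toy sketch. This is not the authors' Perl pipeline: it replaces similarity-based clustering with exact matching of strand-normalized sequences, and the `min_support` filter is an assumption motivated only by the abstract's observation that transcripts unique to one assembler are likely artefacts. The function names and the assembler labels are hypothetical.

```python
from __future__ import annotations

from collections import defaultdict

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(seq: str) -> str:
    """Return a strand-independent key for a contig.

    The key is the lexicographically smaller of the upper-cased sequence
    and its reverse complement, so both strands map to the same key.
    """
    seq = seq.upper()
    reverse_complement = seq.translate(_COMPLEMENT)[::-1]
    return min(seq, reverse_complement)


def consensus_transcripts(
    assemblies: dict[str, list[str]], min_support: int = 2
) -> list[str]:
    """Pool contigs from several assemblers and drop redundancy.

    Assumption (not from the source): contigs are treated as identical
    only when they match exactly after strand normalization, and a contig
    is kept when at least `min_support` assemblers produced it.
    """
    support: dict[str, set[str]] = defaultdict(set)
    for assembler, contigs in assemblies.items():
        for contig in contigs:
            support[canonical(contig)].add(assembler)
    return sorted(
        key for key, assemblers in support.items()
        if len(assemblers) >= min_support
    )


if __name__ == "__main__":
    # Hypothetical toy input: three assemblers, very short "contigs".
    toy = {
        "assembler_a": ["ATGGCC", "ATGAAA"],
        "assembler_b": ["GGCCAT", "ATGTTT"],  # GGCCAT is the reverse complement of ATGGCC
        "assembler_c": ["ATGGCC", "ATGTTT"],
    }
    for transcript in consensus_transcripts(toy):
        print(transcript)
```

Real transcriptome assemblies differ by partial overlaps and small sequence differences, so a practical version would need a similarity-based clustering step rather than exact matching.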
Symbiotic interactions are widespread throughout the animal kingdom and are increasingly recognized as an important trait that can shape the evolution of a species. Sponges are widely understood to be the earliest branching clade of metazoans and often contain dense, diverse yet specific microbial communities which can constitute up to 50% of their biomass. These bacterial communities fulfill diverse functions influencing the sponge's physiology and ecology, and may have greatly contributed to the evolutionary success of the Porifera. Here we analyze and characterize the holo-transcriptome of the hypercalcifying demosponge Vaceletia sp. and compare it to other sponge transcriptomic and genomic data. Vaceletia sp. harbors a diverse and abundant microbial community; by identifying the underlying molecular mechanisms of a variety of lipid pathway components we show that the sponge seems to rely on the supply of short-chain fatty acids by its bacterial community. Comparisons to other sponges reveal that this dependency may be more pronounced in sponges with an abundant microbial community. Furthermore, the presence of bacterial polyketide synthase genes suggests that bacteria are the producers of Vaceletia's abundant mid-chain branched fatty acids, whereas demospongic acids may be produced by the sponge host via elongation and desaturation of short-chain precursors. We show that the sponge and its microbial community have the molecular tools to interact through different mechanisms, including the sponge's immune system and the presence of eukaryotic-like proteins in bacteria. These results expand our knowledge of the complex gene repertoire of sponges and show the importance of metabolic interactions between sponges and their endobiotic microbial communities.
Despite the evolutionary success and ancient heritage of the molluscan shell, little is known about the molecular details of its formation, evolutionary origins, or the interactions between the material properties of the shell and its organic constituents. In contrast to this dearth of information, a growing collection of molluscan shell-forming proteomes and transcriptomes suggests they are composed of both deeply conserved and lineage-specific elements. Analyses of these sequence data sets have suggested that mechanisms such as exon shuffling, gene co-option, and gene family expansion facilitated the rapid evolution of shell-forming proteomes and supported the diversification of this phylum-specific structure. In order to further investigate and test these ideas we have examined the molecular features and spatial expression patterns of two shell-forming genes (Lustrin and ML1A2) and coupled these observations with materials properties measurements of shells from a group of closely related gastropods (abalone). We find that the prominent “GS” domain of Lustrin, a domain believed to confer elastomeric properties to the shell, varies significantly in length between the species we investigated. Furthermore, the spatial expression patterns of Lustrin and ML1A2 also vary significantly between species, suggesting that both protein architecture and the regulation of spatial gene expression patterns are important drivers of molluscan shell evolution. Variation in these molecular features might relate to certain materials properties of the shells of these species. These insights reveal an important and underappreciated source of variation within shell-forming proteomes that must contribute to the diversity of molluscan shell phenotypes.
Background: Horizontal transfer of transposable elements (HTT) is increasingly appreciated as an important source of genome and species evolution in eukaryotes. However, our understanding of HTT dynamics is still poor in eukaryotes because the diversity of species for which whole genome sequences are available is biased and does not reflect the global eukaryote diversity. Results: In this study we characterized two Mariner transposable elements (TEs) in the genome of several terrestrial crustacean isopods, a group of animals particularly underrepresented in genome databases. The two elements have a patchy distribution in the arthropod tree and they are highly similar (>93% over the entire length of the element) to insect TEs (Diptera and Hymenoptera), some of which were previously described in Ceratitis rosa (Crmar2) and Drosophila biarmipes (Mariner-5_Dbi). In addition, phylogenetic analyses and comparisons of TE versus orthologous gene distances at various phylogenetic levels revealed that the taxonomic distribution of the two elements is incompatible with vertical inheritance. Conclusions: We conclude that the two Mariner TEs each underwent at least three HTT events. Both elements were transferred once between isopod crustaceans and insects and at least once between isopod crustacean species. Crmar2 was also transferred between tephritid and drosophilid flies and Mariner-5 underwent HT between hymenopterans and dipterans. We demonstrate that these various HTTs took place recently (most likely within the last 3 million years), and propose iridoviruses and/or Wolbachia endosymbionts as potential vectors of these transfers.
Background: The shells of various Haliotis species have served as models of invertebrate biomineralization and physical shell properties for more than 20 years. A focus of this research has been the nacreous inner layer of the shell with its conspicuous arrangement of aragonite platelets, resembling in cross-section a brick-and-mortar wall. In comparison, the outer, less stable, calcitic prismatic layer has received much less attention. One of the first molluscan shell proteins to be characterized at the molecular level was Lustrin A, a component of the nacreous organic matrix of Haliotis rufescens. This was soon followed by the C-type lectin perlucin and the growth factor-binding perlustrin, both isolated from H. laevigata nacre, and the crystal growth-modulating AP7 and AP24, isolated from H. rufescens nacre. Mass spectrometry-based proteomics was subsequently applied to Haliotis biomineralization research with the analysis of the H. asinina shell matrix and yielded 14 different shell-associated proteins. That study was the most comprehensive for a Haliotis species to date. Methods: The shell proteomes of the nacre and prismatic layers of the marine gastropod Haliotis laevigata were analyzed by combining mass spectrometry-based proteomics and next-generation sequencing. Results: We identified 297 proteins from the nacreous shell layer and 350 proteins from the prismatic shell layer of the green lip abalone H. laevigata. Considering the overlap between the two sets we identified a total of 448 proteins. Fifty-one nacre proteins and 43 prismatic layer proteins were defined as major proteins based on their abundance at more than 0.2% of the total. The remaining proteins occurred at low abundance and may not play any significant role in shell fabrication. The overlap of major proteins between the two shell layers was 17, amounting to a total of 77 major proteins. Conclusions: The H. laevigata shell proteome shares moderate sequence similarity at the protein level with other gastropod, bivalve and more distantly related invertebrate biomineralising proteomes. Features conserved in H. laevigata and other molluscan shell proteomes include short repetitive sequences of low complexity predicted to lack intrinsic three-dimensional structure, and domains such as tyrosinase, chitin-binding, and carbonic anhydrase. This catalogue of H. laevigata shell proteins represents the most comprehensive for a haliotid and should support future efforts to elucidate the molecular mechanisms of shell assembly. Electronic supplementary material: The online version of this article (10.1186/s12953-018-0139-3) contains supplementary material, which is available to authorized users.
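The protein totals in the abstract above are consistent with simple inclusion-exclusion over the two shell layers. As a check, writing $N$ for the nacre set and $P$ for the prismatic set (symbols introduced here for illustration), the number of proteins shared by both layers is not stated in the abstract but is implied by the reported figures, and the total of major proteins follows from the stated overlap of 17:

$$
|N \cap P| = |N| + |P| - |N \cup P| = 297 + 350 - 448 = 199,
\qquad
|N_{\text{major}} \cup P_{\text{major}}| = 51 + 43 - 17 = 77.
$$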
Identifying and understanding mechanisms that generate phenotypic diversity is a fundamental goal of evolutionary biology. With a diversity of pigmented shell morphotypes governed by Mendelian patterns of inheritance, the common grove snail Cepaea nemoralis (Linnaeus, 1758) has been a model for evolutionary biologists and population geneticists for decades. However, the genetic mechanisms by which C. nemoralis generates this pigmented shell diversity remain unknown. An important first step in investigating this pigmentation pattern is to establish a set of validated reference genes for differential gene expression assays. Here we have evaluated eleven candidate genes for reverse transcription quantitative polymerase chain reaction (qPCR) in C. nemoralis. Five of these were housekeeping genes traditionally employed as qPCR reference genes in other species, while six alternative genes were selected de novo from C. nemoralis transcriptome data based on the stability of their expression levels. We tested all eleven candidates for expression stability in four sub-adult tissues of C. nemoralis: pigmented mantle, unpigmented mantle, head and foot. We find that two commonly employed housekeeping genes (alpha tubulin, glyceraldehyde 3-phosphate dehydrogenase) are unsuitable for use as qPCR reference genes in C. nemoralis. The traditional housekeeping gene ubiquitin (UBI), on the other hand, performed very well. Additionally, an RNA-directed DNA polymerase (RNAP), a potassium channel protein (KCHP) and a prenylated Rab acceptor protein 1 (PRAP), identified de novo from transcriptomic data, were the most stably expressed genes in different tissue combinations. We also tested expression stability over two seasons and found that, although other genes are more stable within a single season, beta actin (BACT) and elongation factor 1 alpha (EF1α) were the most reliable reference genes across seasons.