This article introduces the second release of the Gypsy Database of Mobile Genetic Elements (GyDB 2.0): a research project devoted to the evolutionary dynamics of viruses and transposable elements based on their phylogenetic classification (per lineage and protein domain). The Gypsy Database (GyDB) is a long-term, continuously progressing project that, owing to the high molecular diversity of mobile elements, must be completed in several stages. GyDB 2.0 has been equipped with a wiki to allow other researchers to participate in the project. The current database stage and scope are long terminal repeat (LTR) retroelements and relatives. GyDB 2.0 is an update based on the analysis of Ty3/Gypsy, Retroviridae, Ty1/Copia and Bel/Pao LTR retroelements and the Caulimoviridae pararetroviruses of plants. Among other features, in terms of the aforementioned topics, this update adds: (i) a variety of descriptions and reviews distributed in multiple web pages; (ii) protein-based phylogenies, where phylogenetic levels are assigned to distinct classified elements; (iii) a collection of multiple alignments, lineage-specific hidden Markov models and consensus sequences, called the GyDB collection; (iv) updated RefSeq databases and BLAST and HMM servers to facilitate sequence characterization of new LTR retroelement and caulimovirus queries; and (v) a bibliographic server. GyDB 2.0 is available at http://gydb.org.
Significance: We sequenced the genome and transcriptomes of the wild olive (oleaster). More than 50,000 genes were predicted, and evidence was found for two relatively recent whole-genome duplication events, dated at about 28 and 59 million years ago. Whole-genome sequencing, as well as gene expression studies, provide further insights into the evolution of oil biosynthesis, and will aid future studies aimed at further increasing the production of olive oil, which is a key ingredient of the healthy Mediterranean diet and has been granted a qualified health claim by the FDA. Abstract: Here, we present the genome sequence and annotation of the wild olive tree (Olea europaea var. sylvestris), called oleaster, which is considered an ancestor of cultivated olive trees. More than 50,000 protein-coding genes were predicted, a majority of which could be anchored to 23 pseudo-chromosomes obtained through a newly constructed genetic map. The oleaster genome contains signatures of two Oleaceae lineage-specific paleopolyploidy events, dated at approximately 28 and 59 million years ago. These events contributed to the expansion and neofunctionalization of genes and gene families that play important roles in oil biosynthesis. The functional divergence of oil biosynthesis pathway genes, such as FAD2, SACPD, EAR and ACPTE, following duplication, has been responsible for the differential accumulation of oleic and linoleic acids produced in olive compared to sesame, a closely related oil crop. Duplicated oleaster FAD2 genes are regulated by a short-interfering RNA (siRNA) derived from a transposable element-rich region, leading to suppressed levels of FAD2 gene expression. Additionally, neofunctionalization of members of the SACPD gene family has led to increased expression of SACPD2, 3, 5 and 7, consequently resulting in an increased desaturation of stearic acid.
Taken together, decreased FAD2 expression and increased SACPD expression likely explain the accumulation of exceptionally high levels of oleic acid in olive. The oleaster genome thus provides important insights into the evolution of oil biosynthesis and will be a valuable resource for oil crop genomics.
As a symbol of peace, fertility, health and longevity, the olive tree (Olea europaea L.) is a socio-economically important oil crop that is widely grown in the Mediterranean Basin. Belonging to the Oleaceae family (order Lamiales), it can biosynthesize essential unsaturated fatty acids and other important secondary metabolites, such as vitamins and phenolic compounds (1). The olive tree is a diploid (2n = 46) allogamous crop that can be vegetatively propagated and live for thousands of years (2). Paleobotanical evidence suggests that olive oil was already produced in the Bronze Age (3). It has been thought that cultivated varieties were derived from the wild olive tree, called oleaster (O. europaea var. sylvestris), in Asia Minor, which then spread to Greece (4). Nevertheless, the exact domestication history of the olive tree is unknown (5). Due to their longevity, oleaster...
Background: Sequencing projects have allowed diverse retroviruses and LTR retrotransposons from different eukaryotic organisms to be characterized. It is known that retroviruses and other retro-transcribing viruses evolve from LTR retrotransposons and that this whole system clusters into five families: Ty3/Gypsy, Retroviridae, Ty1/Copia, Bel/Pao and Caulimoviridae. Phylogenetic analyses usually show that these split into multiple distinct lineages but what is yet to be understood is how deep evolution occurred in this system.
We performed a comprehensive analysis of the evolution of the Ty3/Gypsy group of long-terminal-repeat retrotransposons (also known as Metaviridae). Exhaustive database searches allowed us to detect novel elements of this group. In particular, the Arabidopsis thaliana and Drosophila melanogaster genome sequencing projects have recently disclosed a large number of new Ty3/Gypsy sequences. So far, elements of three different Ty3/Gypsy lineages had been described for A. thaliana. Here, we describe six new lineages, which we have called Tit-for-tat1, Tit-for-tat2, Gimli, Gloin, Legolas, and Little Athila. We confirm that plant Ty3/Gypsy elements form two main monophyletic groups. Moreover, our results suggest that at least four independent ancestral lineages existed before the monocot-dicot split, about 200 MYA. Twelve sequences from D. melanogaster that may correspond to new elements are also described. Some of these sequences are similar to those of Osvaldo and Ulysses, two elements of the Osvaldo clade that had never before been described for D. melanogaster. Comparative analyses of multiple organisms, some of them with completely sequenced genomes, show that the number of lineages of Ty3/Gypsy elements is very variable. Thus, while only 1 lineage is present in Saccharomyces cerevisiae, at least 6 exist in Caenorhabditis elegans, at least 9 are present in A. thaliana, and perhaps 20 are present in D. melanogaster. Finally, we suggest that the presence of a chromodomain-containing integrase, a feature of some closely related Ty3/Gypsy elements of fungi, plants, and animals, may be used to define a new Metaviridae genus.
Vibrio vulnificus (Vv) is a multi-host pathogenic species currently subdivided into three biotypes (Bts). The three Bts are human pathogens, but only Bt2 is also a fish pathogen, an ability that is conferred by a transferable virulence plasmid (pVvbt2). Here we present a phylogenomic analysis from the core genome of 80 Vv strains belonging to the three Bts recovered from a wide range of geographical and ecological sources. We have identified five well-supported phylogenetic groups or lineages (L). L1 comprises a mixture of clinical and environmental Bt1 strains, most of them involved in human clinical cases related to raw seafood ingestion. L2 is formed by a mixture of Bt1 and Bt2 strains from various sources, including diseased fish, and is related to the aquaculture industry. L3 is also linked to the aquaculture industry and includes Bt3 strains exclusively, mostly related to wound infections or secondary septicemia after farmed-fish handling. Lastly, L4 and L5 include a few strains of Bt1 associated with specific geographical areas. The phylogenetic trees for ChrI and II are not congruent with one another, which suggests that inter- and/or intra-chromosomal rearrangements have occurred during Vv evolution. Further, the phylogenetic trees for each chromosome and the virulence plasmid were also not congruent, which also suggests that pVvbt2 has been acquired independently by different clones, probably in fish farms. Of all these clones, the one with zoonotic capabilities (Bt2-Serovar E) has successfully spread worldwide. Based on these results, we propose a new updated classification of the species based on phylogenetic lineages rather than on Bts, as well as the inclusion of all Bt2 strains in a pathovar with the particular ability to cause fish vibriosis, for which we suggest the name “piscis.”
Fusarium avenaceum is a fungus commonly isolated from soil and associated with a wide range of host plants. We present here three genome sequences of F. avenaceum, one isolated from barley in Finland and two from spring and winter wheat in Canada. The sizes of the three genomes range from 41.6 to 43.1 MB, with 13217–13445 predicted protein-coding genes. Whole-genome analysis showed that the three genomes are highly syntenic and share >95% gene orthologs. Comparative analysis to other sequenced Fusaria shows that F. avenaceum has a very large potential for producing secondary metabolites, with between 75 and 80 key enzymes belonging to the polyketide, non-ribosomal peptide, terpene, alkaloid and indole-diterpene synthase classes. In addition to known metabolites from F. avenaceum, fuscofusarin and JM-47 were detected for the first time in this species. Many protein families are expanded in F. avenaceum, such as transcription factors, and proteins involved in redox reactions and signal transduction, suggesting evolutionary adaptation to a diverse and cosmopolitan ecology. We found that 20% of all predicted proteins were considered to be secreted, supporting a life in the extracellular space during interaction with plant hosts.
Background: The origin of vertebrate retroviruses (Retroviridae) is yet to be thoroughly investigated, but due to their similarity and identical gag-pol (and env) genome structure, it is accepted that they evolve from Ty3/Gypsy LTR retroelements, the retrotransposons and retroviruses of plants, fungi and animals. These 2 groups of LTR retroelements code for 3 proteins rarely studied due to their high variability: the gag polyprotein, the protease and the GPY/F module. In relation to 3 previously proposed Retroviridae classes I, II and III, investigation of the above proteins uncovers important insights regarding the ancient history of Ty3/Gypsy and Retroviridae LTR retroelements. Results: We performed a comprehensive study of 120 non-redundant Ty3/Gypsy and Retroviridae LTR retroelements. Phylogenetic reconstruction inferred based on the concatenated analysis of the gag and pol polyproteins shows a robust phylogenetic signal regarding the clustering of OTUs. Evaluation of gag and pol polyproteins separately yields discordant information. While the pol signal supports the traditional perspective (2 monophyletic groups), the gag polyprotein describes an alternative scenario where each Retroviridae class can be distantly related with one or more Ty3/Gypsy lineages. We investigated this evidence in more depth through comparative analyses based on the gag polyprotein, the protease and the GPY/F module. Our results indicate that, contrary to the traditional monophyletic view of the origin of vertebrate retroviruses, the Retroviridae class I is a molecular fossil, preserving features that were probably predominant among Ty3/Gypsy ancestors predating the split of plants, fungi and animals. In contrast, classes II and III maintain other phenotypes that emerged more recently during Ty3/Gypsy evolution. Conclusion: The 3 Retroviridae classes I, II and III exhibit phenotypic differences that delineate a network never before reported between Ty3/Gypsy and Retroviridae LTR retroelements.
This new scenario reveals how the diversity of vertebrate retroviruses is polyphyletically recurrent into the Ty3/Gypsy evolution, i.e. older than previously thought. The simplest hypothesis to explain this finding is that classes I, II and III trace back to at least 3 Ty3/Gypsy ancestors that emerged at different evolutionary times prior to protostomes-deuterostomes divergence. We have called this "the three kings hypothesis" concerning the origin of vertebrate retroviruses.