Genomic comparisons provide evidence for ancient genome-wide duplications in a diverse array of animals and plants. We developed a birth-death model to identify evidence for genome duplication in EST data, and applied a mixture model to estimate the age distribution of paralogous pairs identified in EST sets for species representing the basal-most extant flowering plant lineages. We found evidence for episodes of ancient genome-wide duplications in the basal angiosperm lineages including Nuphar advena (yellow water lily: Nymphaeaceae) and the magnoliids Persea americana (avocado: Lauraceae), Liriodendron tulipifera (tulip poplar: Magnoliaceae), and Saruma henryi (Aristolochiaceae). In addition, we detected independent genome duplications in the basal eudicot Eschscholzia californica (California poppy: Papaveraceae) and the basal monocot Acorus americanus (Acoraceae), both of which were distinct from duplications documented for ancestral grass (Poaceae) and core eudicot lineages. Among gymnosperms, we found equivocal evidence for ancient polyploidy in Welwitschia mirabilis (Gnetales) and no evidence for polyploidy in pine, although gymnosperms generally have much larger genomes than the angiosperms investigated. Cross-species sequence divergence estimates suggest that synonymous substitution rates in the basal angiosperms are less than half those previously reported for core eudicots and members of Poaceae. These lower substitution rates permit inference of older duplication events. We hypothesize that evidence of an ancient duplication observed in the Nuphar data may represent a genome duplication in the common ancestor of all or most extant angiosperms, except Amborella.
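The mixture-model step mentioned above can be illustrated with a minimal sketch. The abstract does not state the form of the mixture, so this example assumes a two-component Gaussian mixture fitted by expectation-maximization to log-transformed synonymous divergence (Ks) values of paralogous pairs. The function name `fit_two_component_mixture` and the toy Ks values are hypothetical and are not data from the study.

```python
import math


def fit_two_component_mixture(ks_values, iterations=200):
    """Fit a two-component Gaussian mixture to log(Ks) by EM (illustrative only)."""
    x = [math.log(k) for k in ks_values]
    n = len(x)
    mean_all = sum(x) / n
    var_all = sum((v - mean_all) ** 2 for v in x) / n

    # Assumed initialization: one component at each extreme, shared variance.
    means = [min(x), max(x)]
    variances = [var_all, var_all]
    weights = [0.5, 0.5]

    for _ in range(iterations):
        # E-step: posterior probability that each pair belongs to each component.
        resp = []
        for v in x:
            dens = [
                weights[j]
                * math.exp(-((v - means[j]) ** 2) / (2 * variances[j]))
                / math.sqrt(2 * math.pi * variances[j])
                for j in range(2)
            ]
            total = sum(dens)
            resp.append([d / total for d in dens])

        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(2):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * v for r, v in zip(resp, x)) / nj
            var_j = sum(r[j] * (v - means[j]) ** 2 for r, v in zip(resp, x)) / nj
            variances[j] = max(var_j, 1e-6)  # floor avoids a collapsed component

    return weights, means, variances


# Toy Ks values (made up): a young background group and an older group.
toy_ks = [0.10, 0.12, 0.09, 0.11, 1.00, 1.10, 0.90, 1.05]
weights, means, variances = fit_two_component_mixture(toy_ks)
median_ks = [math.exp(m) for m in means]  # component medians on the Ks scale
```

In a sketch like this, a component with a high median Ks and substantial weight would be a candidate signature of an older duplication episode. Deciding how many components the data support would need a separate model-selection step that is not shown here.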
During the past decade, there has been a rapid increase in our understanding of plastid genome organization and evolution due to the availability of many new completely sequenced genomes. There are 45 complete genomes published and ongoing projects are likely to increase this sampling to nearly 200 genomes during the next 5 years. Several groups of researchers including ours have been developing new techniques for gathering and analyzing entire plastid genome sequences and details of these developments are summarized in this chapter. The most important developments that enhance our ability to generate whole chloroplast genome sequences involve the generation of pure fractions of chloroplast genomes by whole genome amplification using rolling circle amplification, cloning genomes into Fosmid or bacterial artificial chromosome (BAC) vectors, and the development of an organellar annotation program (Dual Organellar GenoMe Annotator [DOGMA]). In addition to providing details of these methods, we provide an overview of methods for analyzing complete plastid genome sequences for repeats and gene content, as well as approaches for using gene order and sequence data for phylogeny reconstruction. This explosive increase in the number of sequenced plastid genomes and improved computational tools will provide many insights into the evolution of these genomes and much new data for assessing relationships at deep nodes in plants and other photosynthetic organisms.
Gene duplication plays an important role in the evolution of diversity and novel function and is especially prevalent in the nuclear genomes of flowering plants. Duplicate genes may be maintained through subfunctionalization and neofunctionalization at the level of expression or coding sequence. In order to test the hypothesis that duplicated regulatory genes will be differentially expressed in a specific manner indicative of regulatory subfunctionalization and/or neofunctionalization, we examined expression pattern shifts in duplicated regulatory genes in Arabidopsis. A two-way analysis of variance was performed on expression data for 280 phylogenetically identified paralogous pairs. Expression data were extracted from global expression profiles for wild-type root, stem, leaf, developing inflorescence, nearly mature flower buds, and seedpod. Gene, organ, and gene by organ interaction (G x O) effects were examined. Results indicate that 85% of the paralogous pairs exhibited a significant G x O effect indicative of regulatory subfunctionalization and/or neofunctionalization. A significant G x O effect was associated with complementary expression patterns in 45% of pairwise comparisons. No association was detected between a G x O effect and a relaxed evolutionary constraint as detected by the ratio of nonsynonymous to synonymous substitutions. Ancestral gene expression patterns inferred across a Type II MADS-box gene phylogeny suggest several cases of regulatory neofunctionalization and organ-specific nonfunctionalization. Complete linkage clustering of gene expression levels across organs suggests that regulatory modules for each organ are independent or ancestral genes had limited expression. We propose a new classification, regulatory hypofunctionalization, for an overall decrease in expression level in one member of a paralogous pair while still having a significant G x O effect. 
We conclude that expression divergence specifically indicative of subfunctionalization and/or neofunctionalization contributes to the maintenance of most if not all duplicated regulatory genes in Arabidopsis and hypothesize that this results in increasing expression diversity or specificity of regulatory genes after each round of duplication.
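The gene by organ (G x O) interaction test described above can be sketched for a single paralogous pair. This is a minimal illustration that assumes a balanced design with the same number of replicate measurements in every gene-organ cell. The function name `gxo_interaction_f` and the expression values are hypothetical. It returns only the F statistic and its degrees of freedom, so a p-value would still need to be read from an F distribution.

```python
def gxo_interaction_f(data):
    """Return (F, df_interaction, df_error) for the gene-by-organ interaction.

    data[gene][organ] is a list of replicate expression values (balanced design).
    """
    genes = sorted(data)
    organs = sorted(data[genes[0]])
    a, b = len(genes), len(organs)
    n = len(data[genes[0]][organs[0]])  # replicates per cell

    cell = {(g, o): sum(data[g][o]) / n for g in genes for o in organs}
    gene_mean = {g: sum(cell[g, o] for o in organs) / b for g in genes}
    organ_mean = {o: sum(cell[g, o] for g in genes) / a for o in organs}
    grand = sum(gene_mean.values()) / a

    # Interaction: what remains of each cell mean after removing both main effects.
    ss_inter = n * sum(
        (cell[g, o] - gene_mean[g] - organ_mean[o] + grand) ** 2
        for g in genes
        for o in organs
    )
    # Error: replicate scatter around each cell mean.
    ss_error = sum(
        (x - cell[g, o]) ** 2 for g in genes for o in organs for x in data[g][o]
    )

    df_inter = (a - 1) * (b - 1)
    df_error = a * b * (n - 1)
    f_stat = (ss_inter / df_inter) / (ss_error / df_error)
    return f_stat, df_inter, df_error


# Complementary pattern: each paralog is high where the other is low.
complementary = {
    "paralog_A": {"root": [10, 11], "leaf": [2, 3]},
    "paralog_B": {"root": [2, 3], "leaf": [10, 11]},
}
# Parallel pattern: same organ profile, one paralog uniformly lower.
parallel = {
    "paralog_A": {"root": [10, 11], "leaf": [4, 5]},
    "paralog_B": {"root": [8, 9], "leaf": [2, 3]},
}
print(gxo_interaction_f(complementary))  # large F: strong G x O effect
print(gxo_interaction_f(parallel))       # F of 0: gene effect only, no G x O
```

The parallel case corresponds to an overall drop in expression in one paralog with no change in organ profile, so it produces a gene effect but no interaction.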
While there has been strong support for Amborella and Nymphaeales (water lilies) as branching from basal-most nodes in the angiosperm phylogeny, this hypothesis has recently been challenged by phylogenetic analyses of 61 protein-coding genes extracted from the chloroplast genome sequences of Amborella, Nymphaea, and 12 other available land plant chloroplast genomes. These character-rich analyses placed the monocots, represented by three grasses (Poaceae), as sister to all other extant angiosperm lineages. We have extracted protein-coding regions from draft sequences for six additional chloroplast genomes to test whether this surprising result could be an artifact of long-branch attraction due to limited taxon sampling. The added taxa include three monocots (Acorus, Yucca, and Typha), a water lily (Nuphar), a ranunculid (Ranunculus), and a gymnosperm (Ginkgo). Phylogenetic analyses of the expanded DNA and protein data sets together with microstructural characters (indels) provided unambiguous support for Amborella and the Nymphaeales as branching from the basal-most nodes in the angiosperm phylogeny. However, their relative positions proved to be dependent on the method of analysis, with parsimony favoring Amborella as sister to all other angiosperms and maximum likelihood (ML) and neighbor-joining methods favoring an Amborella + Nymphaeales clade as sister. The ML phylogeny supported the latter hypothesis, but the likelihood for the former hypothesis was not significantly different. Parametric bootstrap analysis, single-gene phylogenies, estimated divergence dates, and conflicting indel characters all help to illuminate the nature of the conflict in resolution of the most basal nodes in the angiosperm phylogeny. Molecular dating analyses provided median age estimates of 161 MYA for the most recent common ancestor (MRCA) of all extant angiosperms and 145 MYA for the MRCA of monocots, magnoliids, and eudicots.
Whereas long sequences reduce variance in branch lengths and molecular dating estimates, the impact of improved taxon sampling on the rooting of the angiosperm phylogeny together with the results of parametric bootstrap analyses demonstrate how long-branch attraction might mislead genome-scale phylogenetic analyses.
Chlamydomonas reinhardtii is a unicellular eukaryotic alga possessing a single chloroplast that is widely used as a model system for the study of photosynthetic processes. This report analyzes the surprising structural and evolutionary features of the completely sequenced 203,395-bp plastid chromosome. The genome is divided by 21.2-kb inverted repeats into two single-copy regions of approximately 80 kb and contains only 99 genes, including a full complement of tRNAs and atypical genes encoding the RNA polymerase. A remarkable feature is that >20% of the genome is repetitive DNA: the majority of intergenic regions consist of numerous classes of short dispersed repeats (SDRs), which may have structural or evolutionary significance. Among other sequenced chlorophyte plastid genomes, only that of the green alga Chlorella vulgaris appears to share this feature. The program MultiPipMaker was used to compare the genic complement of Chlamydomonas with those of other chloroplast genomes and to scan the genomes for sequence similarities and repetitive DNAs. Among the results was evidence that the SDRs were not derived from extant coding sequences, although some SDRs may have arisen from other genomic fragments. Phylogenetic reconstruction of changes in plastid genome content revealed that an accelerated rate of gene loss also characterized the Chlamydomonas/Chlorella lineage, a phenomenon that might be independent of the proliferation of SDRs. Together, our results reveal a dynamic and unusual plastid genome whose existence in a model organism will allow its features to be tested functionally.
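The genome layout quoted above can be checked with simple arithmetic. The sketch below uses only the figures given in the abstract (203,395 bp total, 21.2-kb inverted repeats, more than 20% repetitive DNA). It assumes there are two copies of the inverted repeat and, for the per-region estimate only, that the two single-copy regions are equal in size. The variable names are illustrative.

```python
GENOME_BP = 203_395          # total plastid chromosome length from the abstract
INVERTED_REPEAT_BP = 21_200  # 21.2 kb per inverted-repeat copy
IR_COPIES = 2                # assumption: the repeat is present as one pair

# Sequence left over once both inverted-repeat copies are removed.
single_copy_total_bp = GENOME_BP - IR_COPIES * INVERTED_REPEAT_BP

# Assumption: the two single-copy regions are of equal size.
single_copy_each_bp = single_copy_total_bp / 2

# Lower bound on repetitive DNA implied by ">20% of the genome".
min_repetitive_bp = 0.20 * GENOME_BP

print(single_copy_total_bp)   # 160995
print(single_copy_each_bp)    # 80497.5, consistent with "approximately 80 kb"
print(round(min_repetitive_bp))  # 40679
```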
We identify and quantify two types of error, Type I and Type II, in EST clustering with the CAP3 assembly program. A Type I error occurs when ESTs from the same gene do not form a cluster, whereas a Type II error occurs when ESTs from distinct genes are falsely clustered together. While the Type II error rate is <1.5% for both 5' and 3' EST clustering, the Type I error rate in the 5' EST case is approximately 10 times higher than in the 3' EST case (30% versus 3%). An over-stringent identity rule, e.g., P ≥ 95%, may even inflate the Type I error in both cases. We demonstrate that approximately 80% of the Type I error in 5' EST clustering is due to insufficient overlap among sibling ESTs (ISO error). A novel statistical approach is proposed to correct ISO error and provide more accurate estimates of the true gene cluster profile.
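The two error types defined above can be made concrete with a minimal sketch that scores a clustering against known gene-of-origin labels. The abstract does not say how the rates are normalized, so this example assumes a per-gene Type I rate (the fraction of genes whose ESTs are split over more than one cluster) and a per-cluster Type II rate (the fraction of clusters that mix ESTs from more than one gene). The function name `clustering_error_rates` and the toy assignments are hypothetical.

```python
from collections import defaultdict


def clustering_error_rates(assignments):
    """Return (type_i_rate, type_ii_rate) for (true_gene, cluster_id) pairs."""
    clusters_by_gene = defaultdict(set)
    genes_by_cluster = defaultdict(set)
    for true_gene, cluster_id in assignments:
        clusters_by_gene[true_gene].add(cluster_id)
        genes_by_cluster[cluster_id].add(true_gene)

    # Type I: ESTs from one gene fail to form a single cluster.
    split_genes = sum(1 for c in clusters_by_gene.values() if len(c) > 1)
    # Type II: ESTs from distinct genes are merged into one cluster.
    mixed_clusters = sum(1 for g in genes_by_cluster.values() if len(g) > 1)

    type_i_rate = split_genes / len(clusters_by_gene)
    type_ii_rate = mixed_clusters / len(genes_by_cluster)
    return type_i_rate, type_ii_rate


# Toy example: geneA is split across c1 and c2; geneC and geneD share c4.
toy_assignments = [
    ("geneA", "c1"), ("geneA", "c1"), ("geneA", "c2"),
    ("geneB", "c3"), ("geneB", "c3"),
    ("geneC", "c4"), ("geneD", "c4"),
]
type_i, type_ii = clustering_error_rates(toy_assignments)
print(type_i, type_ii)  # 0.25 0.25
```

A split such as geneA across c1 and c2 is the kind of outcome the ISO error describes when sibling ESTs do not overlap enough to be assembled together.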
The Chloroplast Genome Database (ChloroplastDB) is an interactive, web-based database for fully sequenced plastid genomes, containing genomic, protein, DNA and RNA sequences, gene locations, RNA-editing sites, putative protein families and alignments. With recent technical advances, the rate of generating new organelle genomes has increased dramatically. However, the established ontology for chloroplast genes and gene features has not been uniformly applied to all chloroplast genomes available in the sequence databases. For example, annotations for some published genome sequences have not evolved with gene naming conventions. ChloroplastDB provides unified annotations, gene name search, BLAST and download functions for chloroplast encoded genes and genomic sequences. A user can retrieve all orthologous sequences with one search regardless of gene names in GenBank. This feature alone greatly facilitates comparative research on sequence evolution including changes in gene content, codon usage, gene structure and post-transcriptional modifications such as RNA editing. Orthologous protein sets are classified by TribeMCL and each set is assigned a standard gene name. Over the next few years, as the number of sequenced chloroplast genomes increases rapidly, the tools available in ChloroplastDB will allow researchers to easily identify and compile target data for comparative analysis of chloroplast genes and genomes.
Purpose: Ischemic vascular diseases, including myocardial infarction (MI) and stroke, have been found to be associated with elevated expression of αvβ3-integrin, which provides a promising target for semi-quantitative monitoring of the disease. For the first time, we employed 68Ga-S-2-(isothiocyanatobenzyl)-1,4,7-triazacyclononane-1,4,7-triacetic acid-PEG3-E[c(RGDyK)]2 (68Ga-PRGD2) to evaluate the αvβ3-integrin-related repair in post-MI and post-stroke patients via positron emission tomography/computed tomography (PET/CT). Methods: With Institutional Review Board approval, 23 MI patients (3 days-2 years post-MI) and 16 stroke patients (3 days-13 years post-stroke) were recruited. After giving informed consent, each patient underwent a cardiac or brain PET/CT scan 30 min after the intravenous injection of 68Ga-PRGD2 in a dose of approximately 1.85 MBq (0.05 mCi) per kilogram body weight. Two stroke patients underwent repeat scans three months after the event. Results: Patchy 68Ga-PRGD2 uptake occurred in or around the ischemic regions in 20/23 MI patients and punctate multifocal uptake occurred in 8/16 stroke patients. The peak standardized uptake values (pSUVs) in MI were 1.94 ± 0.48 (mean ± SD; range, 0.62-2.69), significantly higher than those in stroke (mean ± SD, 0.46 ± 0.29; range, 0.15-0.93; P < 0.001). Higher 68Ga-PRGD2 uptake was observed in the patients 1-3 weeks after the initial onset of the MI/stroke event. The uptake levels were significantly correlated with the diameter of the diseases (r = 0.748, P = 0.001 for MI and r = 0.835, P = 0.003 for stroke). Smaller or older lesions displayed no uptake. Conclusions: 68Ga-PRGD2 uptake was observed around the ischemic region in both MI and stroke patients, which was correlated with the disease phase and severity. The different image patterns and uptake levels in MI and stroke patients warrant further investigations.
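The weight-based dosing stated in the Methods can be worked through as a short calculation. This sketch uses the abstract's figure of approximately 1.85 MBq per kilogram and the standard conversion of 1 mCi = 37 MBq. The function names and the 70 kg example weight are illustrative assumptions, not values from the study.

```python
DOSE_MBQ_PER_KG = 1.85  # approximate injected activity per kg, from the abstract
MBQ_PER_MCI = 37.0      # unit conversion: 1 mCi = 37 MBq


def injected_activity_mbq(weight_kg):
    """Approximate injected 68Ga-PRGD2 activity in MBq for a given body weight."""
    return DOSE_MBQ_PER_KG * weight_kg


def mbq_to_mci(activity_mbq):
    """Convert an activity from MBq to mCi."""
    return activity_mbq / MBQ_PER_MCI


# Consistency check on the abstract's units: 1.85 MBq is 0.05 mCi.
print(round(mbq_to_mci(DOSE_MBQ_PER_KG), 4))  # 0.05

# Hypothetical 70 kg patient.
activity = injected_activity_mbq(70)
print(round(activity, 1), round(mbq_to_mci(activity), 2))  # 129.5 3.5
```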