Summary: Arabidopsis thaliana serves as a model organism for the study of fundamental physiological, cellular, and molecular processes. It has also greatly advanced our understanding of intraspecific genome variation. We present a detailed map of variation in 1,135 high-quality re-sequenced natural inbred lines representing the native Eurasian and North African range and recently colonized North America. We identify relict populations that continue to inhabit ancestral habitats, primarily in the Iberian Peninsula. They have mixed with a lineage that has spread to northern latitudes from an unknown glacial refugium and is now found in a much broader spectrum of habitats. Insights into the history of the species and the fine-scale distribution of genetic diversity provide the basis for full exploitation of A. thaliana natural variation through integration of genomes and epigenomes with molecular and non-molecular phenotypes.
SUMMARY The epigenome orchestrates genome accessibility, functionality and three-dimensional structure. Because epigenetic variation can impact transcription and thus phenotypes, it may contribute to adaptation. Here we report 1,107 high-quality single-base resolution methylomes and 1,203 transcriptomes from the 1001 Genomes collection of Arabidopsis thaliana. Although the genetic basis of methylation variation is highly complex, geographic origin is a major predictor of genome-wide DNA methylation levels and of altered gene expression caused by epialleles. Comparison to cistrome and epicistrome datasets identifies associations between transcription factor binding sites, methylation, nucleotide variation and co-expression modules. Physical maps for nine of the most diverse genomes reveal how transposons and other structural variants shape the epigenome, with dramatic effects on immunity genes. The 1001 Epigenomes Project provides a comprehensive resource for understanding how variation in DNA methylation contributes to molecular and non-molecular phenotypes in natural populations of the most studied model plant.
Despite advances in sequencing, the goal of obtaining a comprehensive view of genetic variation in populations is still far from reached. We sequenced 180 lines of A. thaliana from Sweden to obtain as complete a picture as possible of variation in a single region. Whereas simple polymorphisms in the unique portion of the genome are readily identified, other polymorphisms are not. The massive variation in genome size identified by flow cytometry seems largely to be due to 45S rDNA copy number variation, with lines from northern Sweden having particularly large numbers of copies. Strong selection is evident in the form of long-range linkage disequilibrium (LD), as well as in LD between nearby compensatory mutations. Many footprints of selective sweeps were found in lines from northern Sweden, and a massive global sweep was shown to have involved a 700-kb transposition.
Protein secretion systems play a key role in the interaction of bacteria and hosts. EffectiveDB (http://effectivedb.org) contains pre-calculated predictions of bacterial secreted proteins and of intact secretion systems. Here we describe a major update of the database, which was previously featured in the NAR Database Issue. EffectiveDB bundles various tools to recognize Type III secretion signals, conserved binding sites of Type III chaperones, Type IV secretion peptides, eukaryotic-like domains and subcellular targeting signals in the host. Beyond the analysis of arbitrary protein sequence collections, the new release of EffectiveDB also provides a ‘genome-mode’, in which protein sequences from nearly complete genomes or metagenomic bins can be screened for the presence of three important secretion systems (Type III, IV, VI). EffectiveDB contains pre-calculated predictions for currently 1677 bacterial genomes from the EggNOG 4.0 database and for additional bacterial genomes from NCBI RefSeq. The new, user-friendly and informative web portal offers a submission tool for running the EffectiveDB prediction tools on user-provided data.
In plants, gametogenesis occurs late in development, and somatic mutations can therefore be transmitted to the next generation. Longer periods of growth are believed to result in an increase in the number of cell divisions before gametogenesis, with a concomitant increase in mutations arising due to replication errors. However, there is little experimental evidence addressing how many cell divisions occur before gametogenesis. Here, we measured loss of telomeric DNA and accumulation of replication errors in Arabidopsis with short and long life spans to determine the number of replications in lineages leading to gametes. Surprisingly, the number of cell divisions within the gamete lineage is nearly independent of both life span and vegetative growth. One consequence of the relatively stable number of replications per generation is that older plants may not pass along more somatically acquired mutations to their offspring. We confirmed this hypothesis by genomic sequencing of progeny from young and old plants. This independence can be achieved by hierarchical arrangement of cell divisions in plant meristems where vegetative growth is primarily accomplished by expansion of cells in rapidly dividing meristematic zones, which are only rarely refreshed by occasional divisions of more quiescent cells. We support this model by 5-ethynyl-2′-deoxyuridine retention experiments in shoot and root apical meristems. These results suggest that stem-cell organization has independently evolved in plants and animals to minimize mutations by limiting DNA replication.
mutation rate | shoot apical meristem | germline | mismatch repair | telomeres
Recent evidence suggests that metabolic changes play a pivotal role in the biology of cancer and in particular renal cell carcinoma (RCC). Here, a global metabolite profiling approach was applied to characterize the metabolite pool of RCC and normal renal tissue. Advanced decision tree models were applied to characterize the metabolic signature of RCC and to explore features of metastasized tumours. The findings were validated in a second independent dataset. Vitamin E derivatives and metabolites of glucose, fatty acid, and inositol phosphate metabolism determined the metabolic profile of RCC. α-tocopherol, hippuric acid, myoinositol, fructose-1-phosphate and glucose-1-phosphate contributed most to the tumour/normal discrimination and all showed pronounced concentration changes in RCC. The identified metabolic profile was characterized by a low recognition error of only 5% for tumour versus normal samples. Data on metastasized tumours suggested a key role for metabolic pathways involving arachidonic acid, free fatty acids, proline, uracil and the tricarboxylic acid cycle. These results illustrate the potential of mass spectrometry-based metabolomics in conjunction with sophisticated data analysis methods to uncover the metabolic phenotype of cancer. Differentially regulated metabolites, such as vitamin E compounds, hippuric acid and myoinositol, provide leads for the characterization of novel pathways in RCC.
Rheumatoid arthritis (RA) is an autoimmune disease characterized by persistent synovial inflammation. The major drivers of synovial inflammation are cytokines and chemokines. Among these molecules, TNF activates fibroblast-like synoviocytes (FLSs), which leads to the production of inflammatory mediators. Here, we show that TNF regulates the expression of the transcription factor interferon regulatory factor 1 (IRF1) in human FLSs as well as in a TNF transgenic arthritis mouse model. Transcriptomic analyses of IRF1-deficient, TNF-stimulated FLSs define the interferon (IFN) pathway as a major target of IRF1. IRF1 expression is associated with the expression of IFNβ, which leads to the activation of the JAK-STAT pathway. Blocking the JAK-STAT pathway with the Janus kinase inhibitor (JAKinib) baricitinib or tofacitinib reduces the expression of IFN-regulated genes (IRGs) in TNF-activated FLSs. Therefore, we conclude that TNF induces a distinct inflammatory cascade, in which IRGs are key elements, in FLSs. The IFN-signature might be a promising biomarker for the efficient and personalized use of new treatment strategies for RA, such as JAKinibs.
Background: Single Nucleotide Polymorphisms (SNPs) are one of the largest sources of new data in biology. In most papers, SNPs between individuals are visualized with Principal Component Analysis (PCA), an older method for this purpose. Principal Findings: We compare PCA with a newer method, t-Distributed Stochastic Neighbor Embedding (t-SNE), for the visualization of large SNP datasets. We also propose a set of key figures for evaluating these visualizations; t-SNE performs better on all of them. Significance: For transforming data, PCA remains a reasonably good method, but for visualization it should be replaced by a method from the subfield of dimension reduction. To evaluate the performance of visualization, we propose key figures of cross-validation with machine learning methods, as well as indices of cluster validity.
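As an illustration of what an index of cluster validity can look like, the following is a minimal sketch of the silhouette coefficient computed on 2D embedding coordinates. The abstract does not say which indices the authors used, so the choice of the silhouette, the Euclidean distance, the function name `silhouette_score`, and the toy coordinates are all assumptions made for this example.

```python
from math import dist
from statistics import mean


def silhouette_score(points, labels):
    """Mean silhouette coefficient of labelled points (Euclidean distance).

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Group point indices by their population/cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = mean(dist(points[i], points[j]) for j in own)
        # b: mean distance to the members of the nearest other cluster.
        b = min(
            mean(dist(points[i], points[j]) for j in members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return mean(scores)


# Toy 2D embedding (made-up coordinates): two well-separated groups.
embedding = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
separated = silhouette_score(embedding, [0, 0, 1, 1])  # labels match groups
mixed = silhouette_score(embedding, [0, 1, 0, 1])      # labels cut across groups
print(f"separated={separated:.3f} mixed={mixed:.3f}")
```

Values close to 1 indicate that samples sit nearer to their own labelled group than to any other group in the embedding; values near or below 0 indicate overlapping groups. Under these assumptions, the same function could be applied to the PCA and the t-SNE coordinates of one dataset to compare how well each visualization separates known populations.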