Underlying cellular responses is a transcriptional regulatory network (TRN) that modulates gene expression. A useful description of the TRN would decompose the transcriptome into targeted effects of individual transcriptional regulators. Here, we apply unsupervised machine learning to a diverse compendium of over 250 high-quality Escherichia coli RNA-seq datasets to identify 92 statistically independent signals that modulate the expression of specific gene sets. We show that 61 of these transcriptomic signals represent the effects of currently characterized transcriptional regulators. Condition-specific activation of signals is validated by exposure of E. coli to new environmental conditions. The resulting decomposition of the transcriptome provides: a mechanistic, systems-level, network-based explanation of responses to environmental and genetic perturbations; a guide to gene and regulator function discovery; and a basis for characterizing transcriptomic differences in multiple strains. Taken together, our results show that signal summation describes the composition of a model prokaryotic transcriptome.
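The closing claim, that signal summation describes the transcriptome, can be illustrated with a toy forward calculation. This is a minimal sketch, assuming that each gene's expression in a condition is the sum, over signals, of the gene's weight in that signal times the signal's activity in that condition. It does not reproduce the decomposition method used in the study; all gene names, signal names, and numbers below are hypothetical.

```python
# Toy illustration of signal summation (forward direction only).
# expression[gene][condition] = sum over signals of
#     gene weight in signal * signal activity in condition
# All names and values are hypothetical and chosen for illustration.

gene_weights = {
    "geneA": {"signal_iron": 2.0, "signal_acid": 0.0},
    "geneB": {"signal_iron": 0.5, "signal_acid": 1.5},
    "geneC": {"signal_iron": 0.0, "signal_acid": -1.0},
}

activities = {
    "low_iron": {"signal_iron": 3.0, "signal_acid": 0.0},
    "low_pH": {"signal_iron": 0.0, "signal_acid": 2.0},
    "both": {"signal_iron": 3.0, "signal_acid": 2.0},
}


def reconstruct(gene_weights, activities):
    """Return expression[gene][condition] as a weighted sum of signal activities."""
    result = {}
    for gene, weights in gene_weights.items():
        result[gene] = {
            condition: sum(weights[s] * activity[s] for s in weights)
            for condition, activity in activities.items()
        }
    return result


expression = reconstruct(gene_weights, activities)
```

In this toy setup, a gene that belongs to two signals (here `geneB`) responds to the combined condition with the sum of its responses to each condition alone, which is the sense in which the signals add.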
Adaptive laboratory evolution (ALE) has emerged as an effective tool for scientific discovery and addressing biotechnological needs. Much of ALE's utility is derived from reproducibly obtained fitness increases. Identifying causal genetic changes and their combinatorial effects is challenging and time-consuming. Understanding how these genetic changes enable increased fitness can be difficult. A series of approaches that address these challenges was developed and demonstrated using Escherichia coli K-12 MG1655 on glucose minimal media at 37°C. By keeping E. coli in constant substrate excess and exponential growth, fitness increases up to 1.6-fold were obtained compared to the wild type. These increases are comparable to previously reported maximum growth rates in similar conditions but were obtained over a shorter time frame. Across the eight replicate ALE experiments performed, causal mutations were identified using three approaches: identifying mutations in the same gene/region across replicate experiments, sequencing strains before and after computationally determined fitness jumps, and allelic replacement coupled with targeted ALE of reconstructed strains. Three genetic regions were most often mutated: the global transcription gene rpoB, an 82-bp deletion between the metabolic pyrE gene and rph, and an IS element between the DNA structural gene hns and tdk. Model-derived classification of gene expression revealed a number of processes important for increased growth that were missed using a gene classification system alone. The methods described here represent a powerful combination of technologies to increase the speed and efficiency of ALE studies. The identified mutations can be examined as genetic parts for increasing growth rate in a desired strain and for understanding rapid growth phenotypes.
Adaptive laboratory evolution (ALE) is a growing field facilitated by whole-genome sequencing.
The process of ALE involves the continuous culturing of an organism over multiple generations. During an ALE experiment, mutations arise, and those beneficial to the selection pressure are fixed over time in the population. Most ALE experiments analyze a perturbation from a reference state to another (e.g., environmental [1,2] or genetic [3]). After adaptation, understanding what genetic changes enabled an increase in fitness is often desirable (4). Generally there are two methods of evolving microorganisms: batch cultures and chemostats. Each method has its own advantages and disadvantages, in terms of maintenance, growth environment, and selection pressures (5). Applications of ALE are numerous and include those for biotechnological goals, such as improving tolerance to a given compound of interest (6-8), or more progressive uses such as improving electrical current consumption in an organism (9). In addition, there has been a significant focus on using ALE to understand antibiotic resistance to given compounds (i.e., drugs) in order to combat clinical resistance (10). A number of in-depth reviews on ALE have appeared ...
The ferric uptake regulator (Fur) plays a critical role in the transcriptional regulation of iron metabolism. However, the full regulatory potential of Fur remains undefined. Here we comprehensively reconstruct the Fur transcriptional regulatory network in Escherichia coli K-12 MG1655 in response to iron availability using genome-wide measurements (ChIP-exo and RNA-seq). Integrative data analysis reveals that a total of 81 genes in 42 transcription units are directly regulated by three different modes of Fur regulation, including apo- and holo-Fur activation and holo-Fur repression. We show that Fur connects iron transport and utilization enzymes with negative-feedback loop pairs for iron homeostasis. In addition, direct involvement of Fur in the regulation of DNA synthesis, energy metabolism, and biofilm development is found. These results show how Fur exhibits a comprehensive regulatory role affecting many fundamental cellular processes linked to iron metabolism in order to coordinate the overall response of E. coli to iron availability.
Three transcription factors (TFs), OxyR, SoxR, and SoxS, play a critical role in transcriptional regulation of the defense system for oxidative stress in bacteria. However, their full genome-wide regulatory potential is unknown. Here, we perform a genome-scale reconstruction of the OxyR, SoxR, and SoxS regulons in Escherichia coli K-12 MG1655. Integrative data analysis reveals that a total of 68 genes in 51 transcription units (TUs) belong to these regulons. Among them, 48 genes showed more than 2-fold changes in expression level under single-TF-knockout conditions. This reconstruction expands the genome-wide roles of these factors to include direct activation of genes related to amino acid biosynthesis (methionine and aromatic amino acids), cell wall synthesis (lipid A biosynthesis and peptidoglycan growth), and divalent metal ion transport (Mn(2+), Zn(2+), and Mg(2+)). Investigating the co-regulation of these genes with other stress-response TFs reveals that they are independently regulated by stress-specific TFs.
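The "more than 2-fold" criterion corresponds to an absolute log2 fold change greater than 1 between knockout and wild-type expression. The following is a minimal sketch of that filter, assuming simple per-gene expression values and a pseudocount of 1 to avoid division by zero; the gene names, values, and the pseudocount are hypothetical and are not taken from the study.

```python
import math


def log2_fold_change(knockout, wild_type, pseudocount=1.0):
    """log2 ratio of knockout to wild-type expression (pseudocount is an assumption)."""
    return math.log2((knockout + pseudocount) / (wild_type + pseudocount))


def genes_changed(expression, threshold=1.0):
    """Return genes whose |log2 fold change| exceeds the threshold (1.0 = 2-fold)."""
    changed = {}
    for gene, (wild_type, knockout) in expression.items():
        lfc = log2_fold_change(knockout, wild_type)
        if abs(lfc) > threshold:
            changed[gene] = lfc
    return changed


# Hypothetical (wild-type, knockout) expression values.
expression = {
    "geneX": (99.0, 399.0),   # 4-fold up in the knockout
    "geneY": (199.0, 49.0),   # 4-fold down in the knockout
    "geneZ": (100.0, 120.0),  # below the 2-fold threshold
}

changed = genes_changed(expression)
```

Working on the log2 scale treats increases and decreases symmetrically, so a 4-fold increase and a 4-fold decrease have the same magnitude with opposite sign.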
Rapid growth in size and complexity of biological data sets has led to the ‘Big Data to Knowledge’ challenge. We develop advanced data integration methods for multi-level analysis of genomic, transcriptomic, ribosomal profiling, proteomic and fluxomic data. First, we show that pairwise integration of primary omics data reveals regularities that tie cellular processes together in Escherichia coli: the number of protein molecules made per mRNA transcript and the number of ribosomes required per translated protein molecule. Second, we show that genome-scale models, based on genomic and bibliomic data, enable quantitative synchronization of disparate data types. Integrating omics data with models enabled the discovery of two novel regularities: condition invariant in vivo turnover rates of enzymes and the correlation of protein structural motifs and translational pausing. These regularities can be formally represented in a computable format allowing for coherent interpretation and prediction of fitness and selection that underlies cellular physiology.
Comprehensive and systematic analysis of airway gene expression represents a strategy for addressing the multiple, complex, and largely untested hypotheses that exist for disease mechanisms, including asthma. Here, we report a novel real-time PCR-based method specifically designed for quantification of multiple low-abundance transcripts using as little as 2.5 fg of total RNA per gene. This method of gene expression profiling has the same specificity and sensitivity as RT-PCR and a throughput level comparable to low-density DNA microarray hybridization. In this two-step method, multiplex RT-PCR is successfully combined with individual gene quantification via real-time PCR on generated cDNA product. Using this method, we measured the expression of 75 genes in bronchial biopsies from asthmatic versus healthy subjects and found expected increases in expression levels of Th2 cytokines and their receptors in asthma. Surprisingly, we also found increased gene expression of NKCC1, a Na+-K+-Cl- cotransporter. Using an immunohistochemical method, we confirmed increased protein expression for NKCC1 in the asthmatic subject with restricted localization to goblet cells. These data validate the new transcriptional profiling method and implicate NKCC1 in the pathophysiology of mucus hypersecretion in asthma. Potential applications for this method include transcriptional profiling in limited numbers of laser captured cells and validation of DNA microarray data in clinical specimens.
The regulators GadE, GadW and GadX (which we refer to as GadEWX) play a critical role in the transcriptional regulation of the glutamate-dependent acid resistance (GDAR) system in Escherichia coli K-12 MG1655. However, the genome-wide regulatory role of GadEWX is still unknown. Here we comprehensively reconstruct the genome-wide GadEWX transcriptional regulatory network and RpoS involvement in E. coli K-12 MG1655 under acidic stress. Integrative data analysis reveals that GadEWX regulons consist of 45 genes in 31 transcription units and 28 of these genes were associated with RpoS-binding sites. We demonstrate that GadEWX directly and coherently regulate several proton-generating/consuming enzymes with pairs of negative-feedback loops for pH homeostasis. In addition, GadEWX regulate genes with assorted functions, including molecular chaperones, acid resistance, stress response and other regulatory activities. These results show how GadEWX simultaneously coordinate many cellular processes to produce the overall response of E. coli to acid stress.
Catalysis using iron–sulfur clusters and transition metals can be traced back to the last universal common ancestor. The damage to metalloproteins caused by reactive oxygen species (ROS) can prevent cell growth and survival when unmanaged, thus eliciting an essential stress response that is universal and fundamental in biology. Here we develop a computable multiscale description of the ROS stress response in Escherichia coli, called OxidizeME. We use OxidizeME to explain four key responses to oxidative stress: 1) ROS-induced auxotrophy for branched-chain, aromatic, and sulfurous amino acids; 2) nutrient-dependent sensitivity of growth rate to ROS; 3) ROS-specific differential gene expression separate from global growth-associated differential expression; and 4) coordinated expression of iron–sulfur cluster (ISC) and sulfur assimilation (SUF) systems for iron–sulfur cluster biosynthesis. These results show that we can now develop fundamental and quantitative genotype–phenotype relationships for stress responses on a genome-wide basis.