Independent component analysis (ICA) of bacterial transcriptomes has emerged as a powerful tool for obtaining co-regulated, independently-modulated gene sets (iModulons), inferring their activities across a range of conditions, and enabling their association with known genetic regulators. By grouping and analyzing genes based on observations from big data alone, iModulons can provide a novel perspective on how the composition of the transcriptome adapts to environmental conditions. Here, we present iModulonDB (imodulondb.org), a knowledgebase of prokaryotic transcriptional regulation computed from high-quality transcriptomic datasets using ICA. Users select an organism from the home page and then search or browse the curated iModulons that make up its transcriptome. Each iModulon and gene has its own interactive dashboard, featuring plots and tables with clickable, hoverable, and downloadable features. This site enhances research by presenting scientists of all backgrounds with co-expressed gene sets and their activity levels, which lead to improved understanding of regulator-gene relationships, discovery of transcription factors, and the elucidation of unexpected relationships between conditions and genetic regulatory activity. The current release of iModulonDB covers three organisms (Escherichia coli, Staphylococcus aureus and Bacillus subtilis) with 204 iModulons, and can be expanded to cover many additional organisms.
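As a rough illustration of how the outputs of such a decomposition can be read, the sketch below assumes the usual ICA formulation in which an expression matrix X (genes by conditions) is approximated by the product of a gene-weight matrix M and an activity matrix A. It does not run ICA itself. The gene names, numbers, and the fixed membership threshold are all hypothetical; the abstract does not say how gene membership is decided.

```python
# Toy illustration only: every name and number here is hypothetical.
genes = ["geneA", "geneB", "geneC", "geneD"]
conditions = ["control", "stress"]

# M: gene weights (rows = genes, columns = components).
M = [
    [0.9, 0.0],
    [0.8, 0.1],
    [0.0, 0.7],
    [0.1, 0.0],
]

# A: component activities (rows = components, columns = conditions).
A = [
    [1.0, 3.0],
    [2.0, 0.5],
]


def matmul(left, right):
    """Multiply two matrices stored as lists of rows."""
    return [
        [sum(l * r for l, r in zip(row, col)) for col in zip(*right)]
        for row in left
    ]


def imodulon_genes(weights, gene_names, component, threshold):
    """Genes whose absolute weight in one component meets the threshold.

    A fixed threshold is an assumption made for this sketch.
    """
    return [
        name
        for name, row in zip(gene_names, weights)
        if abs(row[component]) >= threshold
    ]


# Expression reconstructed from the two factors (X is approximately M times A).
X = matmul(M, A)

for component in range(len(A)):
    members = imodulon_genes(M, genes, component, threshold=0.5)
    activity = dict(zip(conditions, A[component]))
    print(f"component {component}: genes={members}, activity={activity}")
```

Under these assumptions, each column of M gives a gene set and the matching row of A gives that set's activity across conditions, which is the pairing the dashboards described above present.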
Escherichia coli uses two-component systems (TCSs) to respond to environmental signals. TCSs affect gene expression and are parts of E. coli’s global transcriptional regulatory network (TRN). Here, we identified the regulons of five TCSs in E. coli MG1655: BaeSR and CpxAR, which were stimulated by ethanol stress; KdpDE and PhoRB, induced by limiting potassium and phosphate, respectively; and ZraSR, stimulated by zinc. We analyzed RNA-seq data using independent component analysis (ICA). ChIP-exo data were used to validate condition-specific target gene binding sites. Based on these data, we do the following: (i) identify the target genes for each TCS; (ii) show how the target genes are transcribed in response to stimulus; and (iii) reveal novel relationships between TCSs, which indicate noncognate inducers for various response regulators, such as BaeR to iron starvation, CpxR to phosphate limitation, and PhoB and ZraR to cell envelope stress. Our understanding of the TRN in E. coli is thus notably expanded. IMPORTANCE E. coli is a common commensal microbe found in the human gut microenvironment; however, some strains cause diseases like diarrhea, urinary tract infections, and meningitis. E. coli’s two-component systems (TCSs) modulate target gene expression, especially related to virulence, pathogenesis, and antimicrobial peptides, in response to environmental stimuli. Thus, it is of utmost importance to understand the transcriptional regulation of TCSs to infer bacterial environmental adaptation and disease pathogenicity. Utilizing a combinatorial approach integrating RNA sequencing (RNA-seq), independent component analysis, chromatin immunoprecipitation coupled with exonuclease treatment (ChIP-exo), and data mining, we suggest five different modes of TCS transcriptional regulation. Our data further highlight noncognate inducers of TCSs, which emphasizes the cross-regulatory nature of TCSs in E. coli and suggests that TCSs may have a role beyond their cognate functionalities. In summary, these results can lead to an understanding of the metabolic capabilities of bacteria and to the correct prediction of complex phenotypes under diverse conditions, especially when further incorporated with genome-scale metabolic models.
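One simple way to picture the validation step described in this abstract is as a comparison of two gene lists: genes assigned to a TCS by ICA of the RNA-seq data, and genes with a ChIP-exo binding site for the response regulator. The sketch below is an assumption about how such a comparison could be organized, not the authors' procedure; the function name and the placeholder gene names are hypothetical.

```python
def compare_evidence(ica_genes, chip_bound_genes):
    """Split genes by which line of evidence supports them.

    ica_genes: genes grouped with the regulator by expression (hypothetical input).
    chip_bound_genes: genes with a binding site nearby (hypothetical input).
    """
    expression = set(ica_genes)
    binding = set(chip_bound_genes)
    return {
        "both": sorted(expression & binding),
        "expression_only": sorted(expression - binding),
        "binding_only": sorted(binding - expression),
    }


# Placeholder gene names, not real loci.
result = compare_evidence(
    ica_genes=["gene1", "gene2", "gene3"],
    chip_bound_genes=["gene2", "gene3", "gene4"],
)
for category, members in result.items():
    print(category, members)
```

Genes in the "both" group would be the most strongly supported targets under this reading, while the other two groups flag candidates that only one data type supports.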
Escherichia coli uses two-component systems (TCSs) to respond to environmental signals. TCSs affect gene expression and are parts of E. coli's global transcriptional regulatory network (TRN). Here, we identified the regulons of five TCSs in E. coli MG1655: BaeSR and CpxAR, which were stimulated by ethanol stress; KdpDE and PhoRB, induced by limiting potassium and phosphate, respectively; and ZraSR, stimulated by zinc. We analyzed RNA-seq data using independent component analysis (ICA). ChIP-exo data were used to validate condition-specific target gene binding sites. Based on these data, we (1) identify the target genes for each TCS; (2) show how the target genes are transcribed in response to stimulus; and (3) reveal novel relationships between TCSs, which indicate non-cognate inducers for various response regulators, such as BaeR to iron starvation, CpxR to phosphate limitation, and PhoB and ZraR to cell envelope stress. Our understanding of the TRN in E. coli is thus notably expanded.
Importance
E. coli is a common commensal microbe found in the human gut microenvironment; however, some strains cause diseases like diarrhea, urinary tract infections and meningitis. E. coli's two-component systems (TCSs) modulate target gene expression, especially related to virulence, pathogenesis and anti-microbial peptides, in response to environmental stimuli. Thus, it is of utmost importance to understand the transcriptional regulation of the TCSs to infer bacterial environmental adaptation and disease pathogenicity. Utilizing a combinatorial approach integrating RNA-seq, independent component analysis, ChIP-exo and data mining, we show that TCSs have five different modes of transcriptional regulation. Our data further highlight non-cognate inducers of TCSs, emphasizing the cross-regulatory nature of TCSs in E. coli, and suggest that TCSs may have a role beyond their cognate functionalities. In summary, these results, when further incorporated with genome-scale metabolic models, can lead to an understanding of the metabolic capabilities of bacteria and to the correct prediction of complex phenotypes under diverse conditions.
Keywords
Two-component systems, E. coli, independent component analysis, transcriptomics, ChIP-exo, transcriptional regulatory network, gene targets
Introduction
Bacterial survival and resilience across diverse conditions rely upon environmental sensing and a corresponding response. One pervasive biological design towards this goal consists of a histidine kinase unit to sense the environment and a related response regulator unit to receive the signal and translate it into gene expression changes. This signaling process is known as a two-component system (TCS) (1). In the case of Escherichia coli (E. coli) strain K12 MG1655, there are 30 histidine kinases and 32 response regulators involved in 29 complete two-component systems that mediate responses to various environmental stimuli such as metal sen...
The transcriptional regulatory network in prokaryotes controls global gene expression mostly through transcription factors (TFs), which are DNA-binding proteins. Chromatin immunoprecipitation (ChIP) with DNA sequencing methods can identify TF binding sites across the genome, providing a bottom-up, mechanistic understanding of how gene expression is regulated. ChIP provides indispensable evidence toward the goal of acquiring a comprehensive understanding of cellular adaptation and regulation, including condition-specificity. ChIP-derived data's importance and labor-intensiveness motivate its broad dissemination and reuse, which is currently an unmet need in the prokaryotic domain. To fill this gap, we present proChIPdb (prochipdb.org), an information-rich, interactive web database. This website collects public ChIP-seq/-exo data across several prokaryotes and presents them in dashboards that include curated binding sites, nucleotide-resolution genome viewers, and summary plots such as motif enrichment sequence logos. Users can search for TFs of interest or their target genes, download all data, dashboards, and visuals, and follow external links to understand regulons through biological databases and the literature. This initial release of proChIPdb covers diverse organisms, including most major TFs of Escherichia coli, and can be expanded to support regulon discovery across the prokaryotic domain.
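To make the link between a binding site and its possible target genes concrete, the sketch below assigns a peak to any gene whose annotated 5' end lies within a short distance downstream of the peak, taking strand into account. This is a common heuristic assumed here for illustration; it is not stated to be how proChIPdb curates binding sites. The `Gene` class, the 300 bp window, and all coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # leftmost genome coordinate
    end: int     # rightmost genome coordinate
    strand: str  # "+" or "-"


def five_prime_end(gene):
    """Coordinate where the gene begins in its direction of transcription."""
    return gene.start if gene.strand == "+" else gene.end


def candidate_targets(peak_center, genes, max_upstream=300):
    """Names of genes whose 5' end is 0..max_upstream bp downstream of the peak."""
    targets = []
    for gene in genes:
        begin = five_prime_end(gene)
        # Positive when the peak sits upstream of the gene on the gene's own strand.
        upstream = begin - peak_center if gene.strand == "+" else peak_center - begin
        if 0 <= upstream <= max_upstream:
            targets.append(gene.name)
    return targets


# Hypothetical annotation: geneA and geneB are divergently transcribed.
genes = [
    Gene("geneA", 1000, 2000, "+"),
    Gene("geneB", 100, 800, "-"),
    Gene("geneC", 5000, 6000, "+"),
]

print(candidate_targets(900, genes))
```

In this toy layout a single peak between two divergently transcribed genes is assigned to both, which is one reason curated binding-site tables are useful alongside automated assignment.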