Machine Learning Uncovers a Data-Driven Transcriptional Regulatory Network for the Crenarchaeal Thermoacidophile Sulfolobus acidocaldarius

Chauhan, Siddharth M.; Poudel, Saugat; Rychel, Kevin; Lamoureux, Cameron; Yoo, Reo; Bulushi, Tahani Al; Yuan, Yuan; Palsson, Bernhard Ø.; Sastry, Anand V.

doi:10.3389/fmicb.2021.753521

Cited by 24 publications

(25 citation statements)

References 66 publications


“…Taken together, PRECISE-1K and iModulons extracted from it highlight the central role that top-down, data-driven methods must take in transcriptional regulatory network discovery across organisms. Indeed, iModulons have already successfully generated top-down regulatory networks for other organisms (Chauhan et al, 2021; Lim et al, 2022; Poudel et al, 2020; Rajput et al, 2022; Rychel et al, 2020; Sastry et al, 2019; Yoo et al, 2022). The success of PRECISE-1K serves to further cement both the importance of pursuing such efforts and the reliability of the results.…”

Section: Discussion (mentioning)

confidence: 99%

“…Independent component analysis (ICA) (Comon, 1994) is a signal processing algorithm that outperforms other methods for the extraction of biologically meaningful regulatory modules from gene expression data (Saelens et al, 2018). Application of this method to publicly-available prokaryotic expression data has consistently recovered TRN modules across organisms (Chauhan et al, 2021; Poudel et al, 2020; Rajput et al, 2022; Rychel et al, 2020; Sastry et al, 2019; Yoo et al, 2022). ICA’s effectiveness results from its ability to identify independent groups of genes that vary consistently across samples, regardless of group size or overlapping membership.…”

Section: Introduction (mentioning)

confidence: 99%
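The decomposition described in the citation statement above can be illustrated with a small sketch. This is a minimal NumPy version of a FastICA-style algorithm (tanh nonlinearity, symmetric decorrelation) run on synthetic data; it is not the pipeline used by the cited authors. The function name `fast_ica`, the matrix sizes, the sparsity level, and the noise scale are all assumptions chosen for illustration. The expression matrix `X` (genes by samples) is modeled as `M @ A`, where each column of `M` holds gene weights for one independent module and each row of `A` holds that module's activity across samples.

```python
import numpy as np


def fast_ica(X, n_components, n_iter=500, tol=1e-8, seed=0):
    """Illustrative FastICA: X is genes x samples; returns M (genes x k), A (k x samples)."""
    rng = np.random.default_rng(seed)
    n_genes = X.shape[0]
    k = n_components

    # Center each sample across genes, then whiten with a truncated SVD.
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = U[:, :k] * np.sqrt(n_genes)                      # whitened data, genes x k
    unwhiten = (s[:k, None] / np.sqrt(n_genes)) * Vt[:k]  # maps back to sample space

    # Start from a random orthogonal unmixing matrix.
    W, _ = np.linalg.qr(rng.normal(size=(k, k)))

    for _ in range(n_iter):
        Y = Z @ W.T
        G = np.tanh(Y)
        # Fixed-point update: E[z g(w.z)] - E[g'(w.z)] w, for every row w of W.
        W_new = (G.T @ Z) / n_genes - (1.0 - G**2).mean(axis=0)[:, None] * W
        # Symmetric decorrelation keeps the rows of W orthonormal.
        u, _, vt = np.linalg.svd(W_new)
        W_new = u @ vt
        delta = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0))
        W = W_new
        if delta < tol:
            break

    M = Z @ W.T       # independent gene-weight vectors
    A = W @ unwhiten  # module activities per sample
    return M, A


# Synthetic example (all values are made up): three sparse gene modules.
rng = np.random.default_rng(42)
n_genes, n_samples, k = 500, 40, 3
M_true = rng.laplace(size=(n_genes, k)) * (rng.random((n_genes, k)) < 0.1)
A_true = rng.normal(size=(k, n_samples))
X = M_true @ A_true + 0.01 * rng.normal(size=(n_genes, n_samples))

M_est, A_est = fast_ica(X, n_components=k)

# ICA recovers components only up to order and sign, so compare by |correlation|.
corr = np.abs(np.corrcoef(M_true.T, M_est.T)[:k, k:])
print(corr.max(axis=1))
```

Because ICA leaves the sign and ordering of components undetermined, the comparison matches each true module to its best-correlated recovered component rather than assuming a fixed order.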


A multi-scale transcriptional regulatory network knowledge base for Escherichia coli

Sastry et al. 2021. Preprint. Self-cite.

Uncovering the structure of the transcriptional regulatory network (TRN) that modulates gene expression in prokaryotes remains an important challenge. Transcriptomics data is plentiful, necessitating the development of scalable methods for converting this data into useful knowledge about the TRN. Previously, we published the PRECISE dataset for Escherichia coli K-12 MG1655, containing 278 RNA-seq datasets created using a standardized protocol. Here, we present PRECISE 2.0, which is nearly three times the size of the original PRECISE dataset and also created using a standardized protocol. We analyze PRECISE 2.0 at multiple scales, demonstrating multiple analytical strategies for extracting knowledge from this dataset. Specifically, we: (1) highlight patterns in gene expression across the dataset; (2) utilize independent component analysis to extract 218 independently modulated groups of genes (iModulons) that describe the TRN at the systems level; (3) demonstrate the utility of iModulons over traditional differential expression analysis; and (4) uncover 6 new potential regulons. Thus, PRECISE 2.0 is a large-scale, high-quality transcriptomics dataset which may be analyzed at multiple scales to yield important biological insights.


“…Additionally, we found both the WhiB1 and GroEL/ES complex iModulons play a role in protein synthesis. WhiB1 also contains several genes that code for RNA polymerase subunits, and is likely a translation iModulon that has been seen in the ICA decompositions of other organisms (9, 10, 45). All three iModulons are related to growth and replication, which suggests that cell division is an important response in M. tuberculosis in a decreasing oxygen environment.…”

Section: Results (mentioning)

confidence: 99%

Machine Learning of All Mycobacterium tuberculosis H37Rv RNA-seq Data Reveals a Structured Interplay between Metabolism, Stress Response, and Infection

Yoo, Rychel, Poudel et al. 2022. mSphere. Self-cite.

Mycobacterium tuberculosis H37Rv is one of the world's most impactful pathogens, and a large part of the success of the organism relies on the differential expression of its genes to adapt to its environment. The expression of the organism's genes is driven primarily by its transcriptional regulatory network, and most research on the TRN focuses on identifying and quantifying clusters of coregulated genes known as regulons.


“…and genes involved in translation such as infA and fusA which encode translation initiation factor IF-1 and elongation factor G respectively (Figure 2b). This iModulon has been enriched in almost all bacteria and archaea for which iModulons have been calculated (20, 22–25).…”

Section: Expanding the USA300 iModulons Using RNA-sequencing Data Fro... (mentioning)

confidence: 99%

Coordination of CcpA and CodY regulators in Staphylococcus aureus USA300 strains

Poudel, Hefner, Szubin et al. 2022. Preprint. Self-cite.

The complex crosstalk between metabolism and gene regulatory networks makes it difficult to untangle individual constituents and study their precise roles and interactions. To address this issue, we modularized the transcriptional regulatory network (TRN) of the Staphylococcus aureus strain by applying Independent Component Analysis (ICA) to 385 RNA sequencing samples. We then combined the modular TRN model with a metabolic model to study the regulation of carbon and amino acid metabolism. Our analysis showed that regulation of central carbon metabolism by CcpA and amino acid biosynthesis by CodY are closely coordinated. In general, S. aureus increases the expression of CodY-regulated genes in the presence of preferred carbon sources such as glucose. This transcriptional coordination was corroborated by metabolic model simulations that also showed increased amino acid biosynthesis in the presence of glucose. Further, we found that CodY and CcpA cooperatively regulate the expression of ribosome hibernation promoting factor, thus linking metabolic cues with translation. In line with this hypothesis, expression of CodY-regulated genes is tightly correlated with expression of genes encoding ribosomal proteins. Together, we propose a coarse-grained model where expression of S. aureus genes encoding enzymes that control carbon flux and nitrogen flux through the system is coregulated with expression of translation machinery to modularly control protein synthesis. While this work focuses on three key regulators, the full TRN model we present contains 76 total independently modulated sets of genes, each with the potential to uncover other complex regulatory structures and interactions.
