The Polycomb-group (PcG) and Trithorax-group (TrxG) proteins are two major classes of epigenetic regulators that are important for proper differentiation during development (1,2). In Drosophila melanogaster (D. melanogaster), Polycomb response elements (PREs) are short segments of DNA with a high density of binding sites for transcription factors (TFs) that recruit PcG and TrxG proteins to chromatin. Each PRE has a different number of binding sites for PcG and TrxG recruiters, and these binding sites have different topological organizations. It is thus difficult to find general rules for locating PREs across the entire genome. We have developed a framework that uses machine learning algorithms to predict the locations and roles of potential PRE regions over the entire D. melanogaster genome. Using a combination of motif-based and simple sequence-based features, we trained a random forest (RF) model that distinguishes potential active PRE regions from non-PRE regions with precision and recall of ~0.92 upon cross-validation. In the process, the model suggests that previously unrecognized TFs might contribute to PcG/TrxG recruitment at PRE locations, as the presence of binding sites for those factors is strongly informative of active PREs. A secondary regression model provides information on features that further differentiate PREs into functional subclasses. Our findings provide both new predictions of 7887 potential PREs in the D. melanogaster genome and new mechanistic insight into the set of DNA-associated proteins that may contribute to PcG recruitment and/or activity.

Author summary

During the development of multicellular organisms, the pattern of gene expression for every cell type must be established and then inherited through cell division. DNA cis-regulatory elements called Polycomb response elements (PREs) are major drivers of the spatio-temporal regulation of many genes during development. Recently, it has been reported that, depending on the cell type and condition, these elements can have dual functionality: a PRE can act as a silencing element in one cell type and as an enhancer in others. Integrating all binding information of characterized TFs with DNA sequence features allowed us to construct a highly informative machine learning model to predict and classify PRE locations. Applying our model to Drosophila allowed us to identify many new putative PREs (15 to 16 times the number currently known from experimental studies) and provided insight into the specific DNA-binding proteins that may additionally contribute to PRE function.
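As a rough illustration of what combining motif-based and simple sequence-based features could look like for a single genomic window, here is a minimal sketch. The motif patterns, the feature names, and the `window_features` function are assumptions made for this example only; the text above does not specify which motifs or sequence features were actually used. Feature dictionaries of this kind could then be supplied to a classifier such as a random forest.

```python
import re
from itertools import product

# Hypothetical, simplified motif patterns (assumptions for illustration only;
# the source does not list the motifs or features it used).
MOTIFS = {
    "GAGAG_like": "GAGAG",
    "GCCAT_like": "GCCAT",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def count_motif(seq, pattern):
    """Count overlapping matches of `pattern` on both strands."""
    # A lookahead makes each match zero-width, so overlapping sites are counted.
    regex = re.compile("(?=" + pattern + ")")
    forward = len(regex.findall(seq))
    reverse = len(regex.findall(reverse_complement(seq)))
    return forward + reverse


def window_features(seq):
    """Build a feature dictionary for one sequence window.

    Features: motif counts (both strands), GC content, and the
    frequency of each of the 16 dinucleotides on the forward strand.
    """
    seq = seq.upper()
    n = len(seq)
    features = {name: count_motif(seq, pat) for name, pat in MOTIFS.items()}
    features["gc_content"] = (seq.count("G") + seq.count("C")) / n if n else 0.0

    total_pairs = max(n - 1, 1)  # avoid division by zero for very short input
    for a, b in product("ACGT", repeat=2):
        di = a + b
        count = sum(1 for i in range(n - 1) if seq[i:i + 2] == di)
        features["di_" + di] = count / total_pairs
    return features


if __name__ == "__main__":
    example = window_features("GAGAGAGCCAT")
    print(example["GAGAG_like"], example["GCCAT_like"], round(example["gc_content"], 3))
```

Counting motifs on both strands reflects the fact that a binding site can occur in either orientation. Whether overlapping sites should be counted separately is a modelling choice that this sketch makes arbitrarily.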
Cells adapt to changes in their environment through transcriptional responses that are hard-coded in their regulatory networks. Such dedicated pathways, however, may be inadequate for adaptation to novel or extreme environments. We propose the existence of a fitness optimization mechanism that tunes the global transcriptional output of a genome to match arbitrary external conditions in the absence of dedicated gene-regulatory networks. We provide evidence for the proposed tuning mechanism in the adaptation of Saccharomyces cerevisiae to laboratory-engineered environments that are foreign to its native gene-regulatory network. We show that transcriptional tuning operates locally at individual gene promoters and its efficacy is modulated by genetic perturbations to chromatin modification machinery.