In two-dimensional parameter spaces, nonlinear systems producing solutions of a fixed periodicity form islands of a characteristic shape, called "shrimp"-shaped domains (SSDs). In simulations of electronic circuits, SSDs of different periodicities were recently found to be connected along spirals. By means of a hardware realization of the simulations, we provide a first direct proof of the real-world existence of this phenomenon. An improved description establishes a close experiment-simulation correspondence, and a simplified circuit family demonstrates the homoclinic saddle-focus origin of the phenomenon.
We present the software Condition-specific Regulatory Units Prediction (CRUP) to infer from epigenetic marks a list of regulatory units consisting of dynamically changing enhancers with their target genes. The workflow consists of a novel pre-trained enhancer predictor that can be reliably applied across cell types and species, solely based on histone modification ChIP-seq data. Enhancers are subsequently assigned to different conditions and correlated with gene expression to derive regulatory units. We thoroughly test and then apply CRUP to a rheumatoid arthritis model, identifying enhancer-gene pairs comprising known disease genes as well as new candidate genes.
The binding of transcription factors to short recognition sequences plays a pivotal role in controlling the expression of genes. The sequence and shape characteristics of binding sites influence DNA binding specificity and have also been implicated in modulating the activity of transcription factors downstream of binding. To quantitatively assess the transcriptional activity of tens of thousands of designed synthetic sites in parallel, we developed a synthetic version of STARR-seq (synSTARR-seq). We used the approach to systematically analyze how variations in the recognition sequence of the glucocorticoid receptor (GR) affect transcriptional regulation. Our approach resulted in the identification of a novel highly active functional GR binding sequence and revealed that sequence variation both within and flanking GR’s core binding site can modulate GR activity without apparent changes in DNA binding affinity. Notably, we found that the sequence composition of variants with similar activity profiles was highly diverse. In contrast, groups of variants with similar activity profiles showed specific DNA shape characteristics, indicating that DNA shape may be a better predictor of activity than DNA sequence. Finally, using single-cell experiments with individual enhancer variants, we obtained clues indicating that the architecture of the response element can independently tune expression mean and cell-to-cell variability in gene expression (noise). Together, our studies establish synSTARR as a powerful method to systematically study how DNA sequence and shape modulate transcriptional output and noise.
The binding of transcription factors to short recognition sequences plays a pivotal role in controlling the expression of genes. The sequence and shape characteristics of binding sites influence DNA binding specificity and have also been implicated in modulating the activity of transcription factors downstream of binding. To quantitatively assess the transcriptional activity of tens of thousands of designed synthetic sites in parallel, we developed a synthetic version of STARR-seq (synSTARR-seq). We used the approach to systematically analyze how variations in the recognition sequence of the glucocorticoid receptor (GR) affect transcriptional regulation. Our approach resulted in the identification of a novel highly active functional GR binding sequence and revealed that sequence variation both within and flanking GR's core binding site can modulate GR activity without apparent changes in DNA binding affinity. Notably, we found that the sequence composition of variants with similar activity profiles was highly diverse. In contrast, groups of variants with similar activity profiles showed distinct DNA shape characteristics, indicating that DNA shape may be a better predictor of activity than DNA sequence. Finally, using single-cell experiments with individual enhancer variants, we obtained clues indicating that the architecture of the response element can independently tune expression mean and cell-to-cell variability in gene expression (noise). Together, our studies establish synSTARR as a powerful method to systematically study how DNA sequence and shape modulate transcriptional output and noise.
Keywords: Enhancers, transcriptional regulation, glucocorticoid receptor, transcriptional noise, DNA shape
The interplay between transcription factors (TFs) and genomically encoded cis-regulatory elements plays a key role in specifying where and when genes are expressed. In addition, the architecture of cis-regulatory elements influences the expression level of individual genes. For example, transcriptional output can be tuned by varying the number of TF binding sites, either for a given TF or for distinct TFs, present at an enhancer [1, 2]. Moreover, differences in a TF's DNA-binding sites can modulate the magnitude of transcriptional activation, as exemplified by the glucocorticoid receptor (GR), a hormone-activated TF [3][4][5]. The sequence differences can reside within the 15 base pair (bp) core GR binding sequence (GBS), consisting of two imperfect 6 bp palindromic half-sites separated by a 3 bp spacer. Moreover, sequences directly flanking the core also modulate GR activity [3]. However, these sequence-induced changes in activity cannot be explained by affinity [3]. Instead, the flanking nucleotides induce structural changes in both DNA and the DNA binding domain of GR, arguing for their role in tuning GR activity [3]. Notably, the expression level of a gene is typically measured for p...
Shrimps are islands of periodicity within a chaotic sea in phase and parameter spaces of dimensions larger than one. Islands of different periodicities have recently been shown to be connected by spirals that emanate from a joint focal point, paving ways to wander around in parameter space without ever crossing the chaotic sea. We discuss the shrimp building and scaling principles as well as the influence of individual system properties. While the emergence of shrimps has abundantly been demonstrated for artificial systems, we discuss here in detail evidence of rich hierarchies of shrimps in experimental systems. We finally pinpoint the importance of shrimps in the field of bioinformatics.
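The idea of islands of fixed periodicity in a two-parameter plane can be illustrated with a minimal sketch. The Hénon map used below is a stand-in chosen for brevity and is not one of the systems discussed in these abstracts; the function names, the tolerance, the transient length and the period cutoff are all illustrative assumptions.

```python
def attractor_period(a, b, transient=2000, max_period=64, tol=1e-6):
    """Estimate the period of the attractor of the Henon map at (a, b).

    Map (illustrative choice): x' = a - x**2 + b*y, y' = x.
    Returns -1 if the orbit escapes, and 0 if no period up to
    max_period is detected (chaotic or of higher period).
    """
    x, y = 0.0, 0.0
    # Discard the transient so the orbit settles onto its attractor.
    for _ in range(transient):
        x, y = a - x * x + b * y, x
        if abs(x) > 1e6:
            return -1
    x0, y0 = x, y
    # Look for the smallest p at which the orbit returns to its start.
    for p in range(1, max_period + 1):
        x, y = a - x * x + b * y, x
        if abs(x) > 1e6:
            return -1
        if abs(x - x0) < tol and abs(y - y0) < tol:
            return p
    return 0


def period_map(a_values, b_values):
    """Period of the attractor on a grid; rows follow b, columns follow a."""
    return [[attractor_period(a, b) for a in a_values] for b in b_values]
```

Colouring the output of `period_map` by period over a fine grid in the (a, b) plane is one way to make regions of equal periodicity visible; the grid resolution and parameter ranges needed to resolve individual shrimps are left open here.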
Motivation: The construction of statistics for summarizing posterior samples returned by a Bayesian phylogenetic study has so far been hindered by the poor geometric insights available into the space of phylogenetic trees. Ad hoc methods such as the derivation of a consensus tree make up for the ill-definition of the usual concept of posterior mean, while bootstrap methods mitigate the absence of a sound concept of variance. Although they yield satisfactory results with sufficiently concentrated posterior distributions, such methods fall short of providing a faithful summary of posterior distributions if the data do not offer compelling evidence for a single topology. Results: Building upon previous work of Billera et al., summary statistics such as sample mean, median and variance are defined as the Fréchet mean, geometric median and Fréchet variance, respectively. Their computation is enabled by recently published works and embeds an algorithm for computing shortest paths in the space of trees. In a study of the phylogeny of a set of plants, where several tree topologies occur in the posterior sample, the posterior mean correctly balances the contributions from the different topologies, where a consensus tree would be biased. Comparisons of the posterior mean, median and consensus trees with the ground truth using simulated data also reveal the benefits of a sound averaging method when reconstructing phylogenetic trees. Availability and implementation: We provide two independent implementations of the algorithm for computing Fréchet means, geometric medians and variances in the space of phylogenetic trees. TFBayes: https://github.com/pbenner/tfbayes, TrAP: https://github.com/bacak/TrAP. Contact: philipp.benner@mis.mpg.de
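The definitions of the Fréchet mean, geometric median and Fréchet variance can be sketched in ordinary Euclidean space, where distances are straight lines. This is only an analogue under that assumption: in the space of phylogenetic trees the distances would have to come from a shortest-path algorithm, which is not reproduced here, and the code below is not taken from TFBayes or TrAP. All function names are hypothetical.

```python
import math


def frechet_mean(points):
    """Minimiser of the sum of squared distances.

    In Euclidean space this is the arithmetic mean.
    """
    n, dim = len(points), len(points[0])
    return [sum(p[k] for p in points) / n for k in range(dim)]


def frechet_variance(points):
    """Mean squared distance of the points to their Frechet mean."""
    m = frechet_mean(points)
    return sum(math.dist(p, m) ** 2 for p in points) / len(points)


def geometric_median(points, iterations=200, eps=1e-12):
    """Minimiser of the sum of (unsquared) distances.

    Computed with Weiszfeld's iteration; distances are clamped to eps
    to avoid division by zero (a common simplification).
    """
    dim = len(points[0])
    y = frechet_mean(points)  # starting guess
    for _ in range(iterations):
        weights = [1.0 / max(math.dist(p, y), eps) for p in points]
        total = sum(weights)
        y = [sum(w * p[k] for w, p in zip(weights, points)) / total
             for k in range(dim)]
    return y
```

For the sample `[(0, 0), (1, 0), (0, 1), (10, 10)]`, the mean is pulled towards the outlying point while the median stays near the cluster of three, which is the kind of difference between the two summaries that the abstract refers to.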
Genome segmentation methods are powerful tools to obtain cell type or tissue-specific genome-wide annotations and are frequently used to discover regulatory elements. However, traditional segmentation methods show low predictive accuracy and their data-driven annotations have some undesirable properties. As an alternative, we developed ModHMM, a highly modular genome segmentation method. Inspired by the supra-Bayesian approach, it incorporates predictions from a set of classifiers. This allows genome segmentations to be computed using state-of-the-art methodology. We demonstrate the method on ENCODE data and show that it outperforms traditional segmentation methods not only in terms of predictive performance, but also in qualitative aspects. Therefore, ModHMM is a valuable alternative to study the epigenetic and regulatory landscape across and within cell types or tissues.
Background: Eukaryotic gene regulation is a complex process comprising the dynamic interaction of enhancers and promoters in order to activate gene expression. In recent years, research in regulatory genomics has contributed to a better understanding of the characteristics of promoter elements and for most sequenced model organism genomes there exist comprehensive and reliable promoter annotations. For enhancers, however, a reliable description of their characteristics and location has so far proven to be elusive. With the development of high-throughput methods such as ChIP-seq, large amounts of data about epigenetic conditions have become available, and many existing methods use the information on chromatin accessibility or histone modifications to train classifiers in order to segment the genome into functional groups such as enhancers and promoters. However, these methods often do not consider prior biological knowledge about enhancers such as their diverse lengths or molecular structure. Results: We developed enhancer HMM (eHMM), a supervised hidden Markov model designed to learn the molecular structure of promoters and enhancers. Both consist of a central stretch of accessible DNA flanked by nucleosomes with distinct histone modification patterns. We evaluated the performance of eHMM within and across cell types and developmental stages and found that eHMM successfully predicts enhancers with high precision and recall comparable to state-of-the-art methods, and consistently outperforms those in terms of accuracy and resolution. Conclusions: eHMM predicts active enhancers based on data from chromatin accessibility assays and a minimal set of histone modification ChIP-seq experiments. In comparison to other ’black box’ methods its parameters are easy to interpret. eHMM can be used as a stand-alone tool for enhancer prediction without the need for additional training or a tuning of parameters. The high spatial precision of enhancer predictions gives valuable targets for potential knockout experiments or downstream analyses such as motif search.