Nucleosomes regulate many DNA-dependent processes by controlling the accessibility of DNA, and DNA sequences such as the poly-dA:dT element are known to affect nucleosome binding. We demonstrate that poly-dA:dT tracts form an asymmetric barrier to nucleosome movement in vivo, mediated by ATP-dependent chromatin remodelers. We theorize that nucleosome transit over poly-A elements is more energetically favourable in one direction, leading to an asymmetric arrangement of nucleosomes around these sequences. We demonstrate that different arrangements of poly-A and poly-T tracts result in very different outcomes for nucleosome occupancy in yeast, mouse, and human, and show that yeast takes advantage of this phenomenon in its promoter architecture.
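As a rough illustration of what "arrangements of poly-A and poly-T tracts" means on a single strand, the sketch below scans a sequence for runs of A or T and reports their position and base. The function name `find_homopolymer_tracts` and the `min_len=5` threshold are assumptions made for this example and are not taken from the study.

```python
import re

def find_homopolymer_tracts(seq, min_len=5):
    """Return (start, end, base) for every run of A or T of at least min_len bases.

    Coordinates are 0-based and end-exclusive. min_len is an illustrative
    threshold, not a value taken from the study.
    """
    pattern = re.compile(r"A{%d,}|T{%d,}" % (min_len, min_len))
    return [(m.start(), m.end(), m.group()[0]) for m in pattern.finditer(seq.upper())]

# A poly-A tract followed downstream by a poly-T tract on the same strand.
example = "GCGAAAAAAAGCGCTTTTTTTGC"
tracts = find_homopolymer_tracts(example)
for start, end, base in tracts:
    print(f"poly-{base} tract at {start}-{end} (length {end - start})")
```

Listing the tracts in order along the strand makes it easy to distinguish, for example, a poly-A tract followed by a poly-T tract from the reverse arrangement.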
Predicting how transcription factors (TFs) interpret regulatory sequences to control gene expression remains a major challenge. Past studies have primarily focused on native or engineered sequences, and thus remained limited in scale. Here, we use random sequences as an alternative, measuring the expression output of nearly 100 million synthetic yeast promoters composed of random DNA. Random sequences yield a broad range of reproducible expression levels, indicating that the fortuitous binding sites in random DNA are functional. From these data we learn 'billboard' models of transcriptional regulation that explain 93% of expression variation of test data, recapitulate the organization of native chromatin in yeast, and help refine cis-regulatory motifs. Analyzing the residual variation, we uncover more complex regulatory mechanisms, such as strand, position, and helical face preferences of TFs. Such high-throughput regulatory assays of random DNA provide the large-scale data necessary to learn complex models of cis-regulatory logic.
Genomes encode genes and the regulatory signals that enable those genes to be transcribed, and are continually shaped by evolution. Genomes, including those of human and yeast, encode numerous regulatory elements and transcripts that have limited evidence of conservation or function. Here, we sought to create a genomic null hypothesis by quantifying the gene regulatory activity of evolutionarily naïve DNA, using RNA-seq of evolutionarily distant DNA expressed in yeast and computational predictions of random DNA activity in human cells and tissues. In yeast, we found that >99% of bases in naïve DNA were expressed as part of one or more transcripts. Naïve transcripts are sometimes spliced, and are similar to evolved transcripts in length and expression distribution, indicating that stable expression and/or splicing are insufficient to indicate adaptation. However, naïve transcripts do not reach the extremely high expression levels achieved by evolved genes, and frequently overlap with antisense transcription, suggesting that selection has shaped the yeast transcriptome to achieve high expression and coherent gene structures. In humans, we found that, while random DNA is predicted to have minimal activity, dinucleotide content-matched randomized DNA is predicted to have much of the regulatory activity of evolved sequences, including active chromatin marks at between half (DNase I and H3K4me3) and 1/16th (H3K27ac and H3K4me1) the rate of evolved DNA, and the repression-associated H3K27me3 at about twice the rate of evolved DNA. Naïve human DNA is predicted to be more cell type-specific than evolved DNA and is predicted to generate co-occurring chromatin marks, indicating that these are not reliable indicators of selection. However, extremely high activity is rarely achieved by naïve DNA, consistent with these arising via selection.
Our results indicate that evolving regulatory activity from naïve DNA is comparatively easy in both yeast and humans, and we expect to see many biochemically active and cell type-specific DNA sequences in the absence of selection. Such naïve biochemically active sequences have the potential to evolve a function or, if sufficiently detrimental, selection may act to repress them.
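To make the notion of "dinucleotide content-matched randomized DNA" concrete, here is a minimal sketch that samples a new sequence from the first-order (base-to-next-base) transitions observed in a source sequence. This is an assumption about one common way to approximate dinucleotide matching, not the procedure used in the study; the function name and parameters are hypothetical. A Markov sampler of this kind preserves dinucleotide frequencies only in expectation, unlike an exact dinucleotide shuffle.

```python
import random
from collections import defaultdict

def dinucleotide_matched_random(seq, length=None, seed=None):
    """Sample a sequence whose base-to-base transitions mimic those of seq.

    Hypothetical helper: a first-order Markov chain fitted to seq, so the
    dinucleotide composition matches approximately rather than exactly.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("source sequence must not be empty")
    length = len(seq) if length is None else length
    if length <= 0:
        return ""
    rng = random.Random(seed)

    # Record every observed successor of each base; duplicates encode frequency.
    transitions = defaultdict(list)
    for a, b in zip(seq, seq[1:]):
        transitions[a].append(b)

    out = [rng.choice(seq)]
    while len(out) < length:
        successors = transitions.get(out[-1])
        # If a base was only seen at the very end of seq, it has no successor;
        # fall back to drawing from the overall base composition.
        out.append(rng.choice(successors if successors else seq))
    return "".join(out)

source = "ACGTACGTAC"
sample = dinucleotide_matched_random(source, length=50, seed=0)
print(sample)
```

Because successors are drawn only from transitions seen in the source, the sample never contains a dinucleotide that is absent from the source, provided every base in the source is followed by at least one other base.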