Transcriptional activation domains are essential for gene regulation, but their intrinsic disorder and low primary sequence conservation have made it difficult to identify the amino acid composition features that underlie their activity. Here, we describe a rational mutagenesis scheme that deconvolves the function of four activation domain sequence features (acidity, hydrophobicity, intrinsic disorder, and short linear motifs) by quantifying the activity of thousands of variants in vivo and simulating their conformational ensembles using an all-atom Monte Carlo approach. Our results with a canonical activation domain from the Saccharomyces cerevisiae transcription factor Gcn4 reconcile existing observations into a unified model of its function: the intrinsic disorder and acidic residues keep two hydrophobic motifs from driving collapse. Instead, the most-active variants keep their aromatic residues exposed to the solvent. Our results illustrate how the function of intrinsically disordered proteins can be revealed by high-throughput rational mutagenesis.
Humans carry a much larger percentage of body fat than other primates. Despite the central role of adipose tissue in metabolism, little is known about the evolution of white adipose tissue in primates. Phenotypic divergence is often caused by genetic divergence in cis-regulatory regions. We examined the cis-regulatory landscape of fat during human origins by performing comparative analyses of chromatin accessibility in human and chimpanzee adipose tissue using rhesus macaque as an outgroup. We find that many regions that have decreased accessibility in humans are enriched for promoter and enhancer sequences, are depleted for signatures of negative selection, are located near genes involved with lipid metabolism, and contain a short sequence motif involved in the beigeing of fat, the process in which lipid-storing white adipocytes are transdifferentiated into thermogenic beige adipocytes. The collective closing of many putative regulatory regions associated with beigeing of fat suggests a mechanism that increases body fat in humans.
Changes in transcriptional regulation are thought to be a major contributor to the evolution of phenotypic traits, but the contribution of changes in chromatin accessibility to the evolution of gene expression remains almost entirely unknown. To address this important gap in knowledge, we developed a new method to identify DNase I Hypersensitive (DHS) sites with differential chromatin accessibility between species using a joint modeling approach. Our method overcomes several limitations inherent to conventional threshold-based pairwise comparisons that become increasingly apparent as the number of species analyzed rises. Our approach employs a single quantitative test which is more sensitive than existing pairwise methods. To illustrate, we applied our joint approach to DHS sites in fibroblast cells from five primates (human, chimpanzee, gorilla, orangutan, and rhesus macaque). We identified 89,744 DHS sites, of which 41% are identified as differential between species using the joint model compared with 33% using the conventional pairwise approach. The joint model provides a principled approach to distinguishing single from multiple chromatin accessibility changes among species. We found that nondifferential DHS sites are enriched for nucleotide conservation. Differential DHS sites with decreased chromatin accessibility relative to rhesus macaque occur more commonly near transcription start sites (TSS), while those with increased chromatin accessibility occur more commonly distal to TSS. Further, differential DHS sites near TSS are less cell type-specific than more distal regulatory elements. Taken together, these results point to distinct classes of DHS sites, each with distinct characteristics of selection, genomic location, and cell type specificity.
Humans carry a much larger percentage of body fat than other primates. Despite the central role of adipose tissue in metabolism, little is known about the evolution of white adipose tissue in primates. Phenotypic divergence is often caused by genetic divergence in cis-regulatory regions. We examined the cis-regulatory landscape of fat during human origins by performing comparative analyses of chromatin accessibility in human and chimpanzee adipose tissue using macaque as an outgroup. We find that many cis-regulatory regions that are specifically closed in humans are under positive selection, located near genes involved with lipid metabolism, and contain a short sequence motif involved in the beigeing of fat, the process in which white adipocytes are transdifferentiated into beige adipocytes. While the primary role of white adipocytes is to store lipids, beige adipocytes are thermogenic. The collective closing of many putative regulatory regions associated with beigeing of fat suggests an adaptive mechanism that increases body fat in humans.
Transcriptional activation domains are intrinsically disordered peptides with little primary sequence conservation. These properties have made it difficult to identify the sequence features that define activation domains. For example, although acidic activation domains were discovered 30 years ago, we still do not know what role, if any, acidic residues play in these peptides. To address this question, we designed a rational mutagenesis scheme to independently test four sequence features theorized to control the strength of activation domains: acidity (negative charge), hydrophobicity, intrinsic disorder, and short linear motifs. To test enough mutants to deconvolve these four features, we developed a method to quantify the activities of thousands of activation domain variants in parallel. Our results with Gcn4, a classic acidic activation domain, suggest that acidic residues in particular regions keep two hydrophobic motifs exposed to solvent. We also found that the specific activity of the Gcn4 activation domain increases during amino acid starvation. Our results suggest that Gcn4 may have evolved to have low activity but high inducibility. Our results also demonstrate that high-throughput rational mutation scans will be powerful tools for unraveling the properties that control how intrinsically disordered proteins function.