Transcription factor (TF) DNA sequence preferences direct their regulatory activity, but are currently known for only ~1% of all eukaryotic TFs. Broadly sampling DNA-binding domain (DBD) types from multiple eukaryotic clades, we determined DNA sequence preferences for >1,000 TFs encompassing 54 different DBD classes from 131 diverse eukaryotes. We find that closely related DBDs almost always have very similar DNA sequence preferences, enabling inference of motifs for ~34% of the ~170,000 known or predicted eukaryotic TFs. Sequences matching both measured and inferred motifs are enriched in ChIP-seq peaks and upstream of transcription start sites in diverse eukaryotic lineages. SNPs defining expression quantitative trait loci in Arabidopsis promoters are also enriched for predicted TF binding sites. Importantly, our motif “library” (http://cisbp.ccbr.utoronto.ca) can be used to identify specific TFs whose binding may be altered by human disease risk alleles. These data present a powerful resource for mapping transcriptional networks across eukaryotes.
CRISPR/Cas9 has revolutionized our ability to engineer genomes and conduct genome-wide screens in human cells. Whereas some cell types are amenable to genome engineering, genomes of human pluripotent stem cells (hPSCs) have been difficult to engineer, with reduced efficiencies relative to tumour cell lines or mouse embryonic stem cells. Here, using hPSC lines with stable integration of Cas9 or transient delivery of Cas9-ribonucleoproteins (RNPs), we achieved an average insertion or deletion (indel) efficiency greater than 80%. This high efficiency of indel generation revealed that double-strand breaks (DSBs) induced by Cas9 are toxic and kill most hPSCs. In previous studies, the toxicity of Cas9 in hPSCs was less apparent because of low transfection efficiency and subsequently low DSB induction. The toxic response to DSBs was P53/TP53-dependent, such that the efficiency of precise genome engineering in hPSCs with a wild-type P53 gene was severely reduced. Our results indicate that Cas9 toxicity creates an obstacle to the high-throughput use of CRISPR/Cas9 for genome engineering and screening in hPSCs. Moreover, as hPSCs can acquire P53 mutations, cell replacement therapies using CRISPR/Cas9-engineered hPSCs should proceed with caution, and such engineered hPSCs should be monitored for P53 function.
The repair outcomes at site-specific DNA double-strand breaks (DSBs) generated by the RNA-guided DNA endonuclease Cas9 determine how gene function is altered. Despite the widespread adoption of CRISPR-Cas9 technology to induce DSBs for genome engineering, the resulting repair products have not been examined in depth. Here, the DNA repair profiles of 223 sites in the human genome demonstrate that the pattern of DNA repair following Cas9 cutting at each site is nonrandom and consistent across experimental replicates, cell lines, and reagent delivery methods. Furthermore, the repair outcomes are determined by the protospacer sequence rather than genomic context, indicating that DNA repair profiling in cell lines can be used to anticipate repair outcomes in primary cells. Chemical inhibition of DNA-PK enabled dissection of the DNA repair profiles into contributions from c-NHEJ and MMEJ. Finally, this work elucidates a strategy for using "error-prone" DNA-repair machinery to generate precise edits.
Transcription regulatory networks consist of physical and functional interactions between transcription factors (TFs) and their target genes. The systematic mapping of TF-target gene interactions has been pioneered in unicellular systems, using "TF-centered" methods (e.g., chromatin immunoprecipitation). However, metazoan systems are less amenable to such methods. Here, we used "gene-centered" high-throughput yeast one-hybrid (Y1H) assays to identify 283 interactions between 72 C. elegans digestive tract gene promoters and 117 proteins. The resulting protein-DNA interaction (PDI) network is highly connected and enriched for TFs that are expressed in the digestive tract. We provide functional annotations for approximately 10% of all worm TFs, many of which were previously uncharacterized, and find ten novel putative TFs, illustrating the power of a gene-centered approach. We provide additional in vivo evidence for multiple PDIs and illustrate how the PDI network provides insights into metazoan differential gene expression at a systems level.
The Caenorhabditis elegans genome encodes more than 100 microRNAs (miRNAs). Genetic analyses of miRNA deletion mutants have only provided limited insights into miRNA function. To gain insight into the function of miRNAs, it is important to determine their spatiotemporal expression pattern. Here, we use miRNA promoters driving the expression of GFP as a proxy for miRNA expression. We describe a set of 73 transgenic C. elegans strains, each expressing GFP under the control of a miRNA promoter. Together, these promoters control the expression of 89 miRNAs (66% of all predicted miRNAs). We find that miRNA promoters drive GFP expression in a variety of tissues and that, overall, their activity is similar to that of protein-coding gene promoters. However, miRNAs are expressed later in development, which is consistent with functions after initial body-plan specification. We find that miRNA members belonging to families are more likely to be expressed in overlapping tissues than miRNAs that do not belong to the same family, and provide evidence that intronic miRNAs may be controlled by their own, rather than a host gene promoter. Finally, our data suggest that post-transcriptional mechanisms contribute to differential miRNA expression. The data and strains described here will provide a valuable guide and resource for the functional analysis of C. elegans miRNAs.
Biliary epithelial cells (BECs) form bile ducts in the liver and are facultative liver stem cells that establish a ductular reaction (DR) to support liver regeneration following injury. Liver damage induces periportal LGR5+ putative liver stem cells that can form BEC-like organoids, suggesting that RSPO-LGR4/5-mediated WNT/β-catenin activity is important for a DR. We addressed the roles of this and other signaling pathways in a DR by performing a focused CRISPR-based loss-of-function screen in BEC-like organoids, followed by in vivo validation and single-cell RNA sequencing. We found that BECs lack and do not require LGR4/5-mediated WNT/β-catenin signaling during a DR, whereas YAP and mTORC1 signaling are required for this process. Upregulation of AXIN2 and LGR5 is required in hepatocytes to enable their regenerative capacity in response to injury. Together, these data highlight heterogeneity within the BEC pool, delineate signaling pathways involved in a DR, and clarify the identity and roles of injury-induced periportal LGR5+ cells.
Differential regulation of gene expression is essential for cell fate specification in metazoans. Characterizing the transcriptional activity of gene promoters, in time and in space, is therefore a critical step toward understanding complex biological systems. Here we present an in vivo spatiotemporal analysis for approximately 900 predicted C. elegans promoters (approximately 5% of the predicted protein-coding genes), each driving the expression of green fluorescent protein (GFP). Using a flow-cytometer adapted for nematode profiling, we generated 'chronograms', two-dimensional representations of fluorescence intensity along the body axis and throughout development from early larvae to adults. Automated comparison and clustering of the obtained in vivo expression patterns show that genes coexpressed in space and time tend to belong to common functional categories. Moreover, integration of this data set with C. elegans protein-protein interactome data sets enables prediction of anatomical and temporal interaction territories between protein partners.
Insulin/IGF-1 signaling controls metabolism, stress resistance and aging in Caenorhabditis elegans by regulating the activity of the DAF-16/FoxO transcription factor (TF). However, the function of DAF-16 and the topology of the transcriptional network that it crowns remain unclear. Using chromatin profiling by DNA adenine methyltransferase identification (DamID), we identified 907 genes that are bound by DAF-16. These were enriched for genes showing DAF-16-dependent upregulation in long-lived daf-2 insulin/IGF-1 receptor mutants (P=1.4e−11). Cross-referencing DAF-16 targets with these upregulated genes (daf-2 versus daf-16; daf-2) identified 65 genes that were DAF-16 regulatory targets. These 65 were enriched for signaling genes, including known determinants of longevity, but not for genes specifying somatic maintenance functions (e.g. detoxification, repair). This suggests that DAF-16 acts within a relatively small transcriptional subnetwork activating (but not suppressing) other regulators of stress resistance and aging, rather than directly regulating terminal effectors of longevity. For most genes bound by DAF-16::DAM, transcriptional regulation by DAF-16 was not detected, perhaps reflecting transcriptionally non-functional TF ‘parking sites’. This study demonstrates the efficacy of DamID for chromatin profiling in C. elegans.