Predicting the molecular complexity of a genomic sequencing library has emerged as a critical but difficult problem in modern applications of genome sequencing. Methods to determine how deeply to sequence, or to predict the benefits of additional sequencing, are almost completely lacking. We introduce an empirical Bayesian method to implicitly model any source of bias and accurately characterize the molecular complexity of a DNA sample or library in almost any sequencing application.
We report a robust, versatile approach called CRISPR live-cell fluorescent in situ hybridization (LiveFISH) using fluorescent oligonucleotides for genome tracking in a broad range of cell types, including primary cells. An intrinsic stability switch of CRISPR guide RNAs enables LiveFISH to accurately detect chromosomal disorders such as Patau syndrome in prenatal amniotic fluid cells and track multiple loci in human T lymphocytes. In addition, LiveFISH tracks the real-time movement of DNA double-strand breaks induced by CRISPR-Cas9–mediated editing and consequent chromosome translocations. Finally, by combining Cas9 and Cas13 systems, LiveFISH allows for simultaneous visualization of genomic DNA and RNA transcripts in living cells. The LiveFISH approach enables real-time live imaging of DNA and RNA during genome editing, transcription, and rearrangements in single cells.
Comprehensive identification of factors that can specify neuronal fate could provide valuable insights into lineage specification and reprogramming, but systematic interrogation of transcription factors, and their interactions with each other, has proven technically challenging. We developed a CRISPR activation (CRISPRa) approach to systematically identify regulators of neuronal fate specification. We activated expression of all endogenous transcription factors and other regulators via a pooled CRISPRa screen in embryonic stem cells, revealing genes including epigenetic regulators such as Ezh2 that can induce neuronal fate. Systematic CRISPR-based activation of factor pairs allowed us to generate a genetic interaction map for neuronal differentiation, with confirmation of top individual and combinatorial hits as bona fide inducers of neuronal fate. Several factor pairs could directly reprogram fibroblasts into neurons, which shared similar transcriptional programs with endogenous neurons. This study provides an unbiased discovery approach for systematic identification of genes that drive cell fate acquisition.
Programmable control of spatial genome organization is a powerful approach for studying how nuclear structure affects gene regulation and cellular function. Here, we develop a versatile CRISPR-genome organization (CRISPR-GO) system that can efficiently control the spatial positioning of genomic loci relative to specific nuclear compartments, including the nuclear periphery, Cajal bodies, and promyelocytic leukemia (PML) bodies. CRISPR-GO is chemically inducible and reversible, enabling interrogation of real-time dynamics of chromatin interactions with nuclear compartments in living cells. Inducible repositioning of genomic loci to the nuclear periphery allows for dissection of mitosis-dependent and -independent relocalization events and also for interrogation of the relationship between gene position and gene expression. CRISPR-GO mediates rapid de novo formation of Cajal bodies at desired chromatin loci and causes significant repression of endogenous gene expression over long distances (30–600 kb). The CRISPR-GO system offers a programmable platform to investigate large-scale spatial genome organization and function.
Characterizing epigenetic heterogeneity at the cellular level is a critical problem in the modern genomics era. Assays such as single cell ATAC-seq (scATAC-seq) offer an opportunity to interrogate cellular level epigenetic heterogeneity through patterns of variability in open chromatin. However, these assays exhibit technical variability that complicates clear classification and cell type identification in heterogeneous populations. We present scABC, an R package for the unsupervised clustering of single-cell epigenetic data, to classify scATAC-seq data and discover regions of open chromatin specific to cell identity.
Recent advances in single cell technologies such as scATAC-seq [1, 2] and scChIP-seq [3] have expanded our understanding of epigenetic heterogeneity at the single cell level. However, datasets arising from such technologies are difficult to analyze due to their inherent sparsity. In particular, consider scATAC-seq, designed to interrogate open chromatin in single cells. Open sites in a diploid genome have at most 2 chances to be captured through the assay, and only a few thousand distinct reads are generated per cell, resulting in a very low chance that a particular site is captured by the assay. Consequently, it is difficult to determine whether a region is absent in an individual cell due to a lack of openness or due to the sparse nature of the data. This creates a challenging task in delineating distinct subpopulations, as only a few genomic regions will have overlapping reads in a large number of cells. To avoid this issue, many studies perform FACS sorting to identify subpopulations, followed by bulk sequencing to determine genomic regions of interest and guide the single cell analysis.
If the population is unknown or marker genes are unavailable, then subpopulation-specific analysis becomes impractical with these techniques. To combat these challenges and allow for the de novo classification of individual cells by their epigenetic signatures, we present a statistical method for the unsupervised clustering of scATAC-seq data, named scABC (single cell Accessibility Based Clustering). In contrast to previous works [2, 4] that demand predefined accessible chromatin sites, our procedure relies solely on the patterns of read counts within genomic regions to cluster cells. It requires two inputs: the individual single cell mapped read files and the full set of called peaks (which can be obtained from the union of all of the individual cells without the need for additional experiments). We apply our method to publicly available scATAC-seq data [1, 2, 4] as well as a true biological mixture to show that our approach can cluster cells with similar epigenetic patterns and identify accessible regions specific to each cluster. We further demonstrate that the cluster-specific accessible regions determined by scABC have functional meaning and are capable of determining cellular identity. In particular, we show that these cluster-specific accessible regions are enriched for transcription factor motifs known to be...
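The two inputs described above (per-cell mapped reads and a shared set of called peaks) are typically combined into a cells-by-peaks read count matrix before any clustering. The following is a minimal sketch of that counting step only, not the scABC implementation. The function names, the reduction of each read to a single `(chromosome, position)` pair, and the assumption of non-overlapping, half-open peak intervals are all illustrative choices that do not come from the source.

```python
from bisect import bisect_right
from collections import defaultdict


def build_peak_index(peaks):
    """Group peaks by chromosome and sort them by start coordinate.

    `peaks` is a list of (chrom, start, end) tuples; the list position of a
    peak is its column in the count matrix. Peaks on the same chromosome are
    assumed not to overlap (an assumption of this sketch).
    """
    index = defaultdict(list)
    for col, (chrom, start, end) in enumerate(peaks):
        index[chrom].append((start, end, col))
    for chrom in index:
        index[chrom].sort()
    return index


def count_reads_in_peaks(cells, peaks):
    """Return a cells x peaks matrix (list of lists) of read counts.

    `cells` is a list with one entry per cell; each entry is a list of
    (chrom, position) pairs standing in for that cell's mapped reads.
    Peaks are treated as half-open intervals [start, end).
    """
    index = build_peak_index(peaks)
    starts = {chrom: [s for s, _, _ in ivs] for chrom, ivs in index.items()}
    matrix = []
    for reads in cells:
        row = [0] * len(peaks)
        for chrom, pos in reads:
            if chrom not in index:
                continue  # read falls on a chromosome with no called peaks
            # Rightmost peak whose start is <= pos, if any.
            i = bisect_right(starts[chrom], pos) - 1
            if i >= 0:
                start, end, col = index[chrom][i]
                if start <= pos < end:
                    row[col] += 1
        matrix.append(row)
    return matrix


# Toy data: three peaks and three cells (hypothetical values).
peaks = [("chr1", 100, 200), ("chr1", 500, 650), ("chr2", 50, 120)]
cells = [
    [("chr1", 150), ("chr1", 199), ("chr2", 60)],
    [("chr1", 510)],
    [("chr1", 300), ("chr3", 10)],
]
counts = count_reads_in_peaks(cells, peaks)
```

Even in this toy matrix most entries are zero, which mirrors the sparsity discussed above: a zero may reflect a closed region or simply a site that was not captured in that cell.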
Genome-wide pooled CRISPR-Cas-mediated knockout, activation, and repression screens are powerful tools for functional genomic investigations. Despite their increasing importance, there is currently little guidance on how to design and analyze pooled CRISPR screens. Here, we provide a review of the commonly used algorithms in the computational analysis of pooled CRISPR screens. We develop a comprehensive simulation framework to benchmark and compare the performance of these algorithms using both synthetic and real datasets. Our findings inform parameter choices of CRISPR screens and provide guidance to researchers on the design and analysis of pooled CRISPR screens.
Pooled CRISPR screens allow researchers to interrogate genetic causes of complex phenotypes at the genome-wide scale and promise higher specificity and sensitivity compared to competing technologies. Unfortunately, two problems exist, particularly for CRISPRi/a screens: variability in guide efficiency and large rare off-target effects. We present a method, CRISPhieRmix, that resolves these issues by using a hierarchical mixture model with a broad-tailed null distribution. We show that CRISPhieRmix allows for more accurate and powerful inferences in large-scale pooled CRISPRi/a screens. We discuss key issues in the analysis and design of screens, particularly the number of guides needed for faithful full discovery. Electronic supplementary material: The online version of this article (10.1186/s13059-018-1538-6) contains supplementary material, which is available to authorized users.