Base editing (BE) can be applied to characterize single nucleotide variants (SNVs) of unknown function, yet defining effective combinations of single guide RNAs (sgRNAs) and base editors remains challenging. Here, we describe modular BE-activity ‘sensors’ that link sgRNAs and cognate target sites in cis and use them to systematically measure the editing efficiency and precision of thousands of sgRNAs paired with functionally distinct base editors. By quantifying sensor editing across >200,000 editor–sgRNA combinations, we provide a comprehensive resource of sgRNAs for introducing and interrogating cancer-associated SNVs in multiple model systems. We demonstrate that sensor-validated tools streamline production of in vivo cancer models, and that integrating sensor modules in pooled sgRNA libraries can aid interpretation of high-throughput BE screens. Using this approach, we identify several previously uncharacterized mutant TP53 alleles as drivers of cancer cell proliferation and in vivo tumor development. We anticipate that the framework described here will facilitate the functional interrogation of cancer variants in cell and animal models.
CRISPR-Cas9-based genome editing combined with single-cell sequencing enables the tracing of the history of cell divisions, or cellular lineage, in tissues and whole organisms. While standard phylogenetic approaches may be applied to reconstruct cellular lineage trees from these data, the unique features of the CRISPR-Cas9 editing process motivate the development of specialized models that describe the evolution of CRISPR-Cas9-induced mutations. Here, we introduce the star homoplasy model, a novel evolutionary model that constrains a phylogenetic character to mutate at most once along a lineage, capturing the non-modifiability property of CRISPR-Cas9 mutations. We derive a combinatorial characterization of star homoplasy phylogenies by identifying a relationship between the star homoplasy model and the binary perfect phylogeny model. We use this characterization to develop an algorithm, Startle (Star tree lineage estimator), that computes a maximum parsimony star homoplasy phylogeny. We demonstrate that Startle infers more accurate phylogenies on simulated CRISPR-based lineage tracing data than existing methods, particularly on data with high amounts of dropout and homoplasy. Startle also infers more parsimonious phylogenies with fewer metastatic migrations on a lineage tracing dataset from mouse metastatic lung adenocarcinoma.
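The non-modifiability constraint can be checked directly on a fully labeled tree. The following is a minimal sketch and not the Startle algorithm: it assumes that state 0 marks an unedited character and that every node carries a complete state vector, and the names `satisfies_star_homoplasy`, `parent`, and `states` are hypothetical.

```python
from typing import Dict, List, Optional


def satisfies_star_homoplasy(
    parent: Dict[str, Optional[str]],
    states: Dict[str, List[int]],
    unedited: int = 0,
) -> bool:
    """Return True if no character changes after it has left the unedited state.

    parent maps each node to its parent (None for the root).
    states maps each node to its vector of character states.
    """
    for child, par in parent.items():
        if par is None:
            continue  # the root has no incoming edge to check
        for p_state, c_state in zip(states[par], states[child]):
            # Once edited, a character must keep its state along the lineage.
            if p_state != unedited and c_state != p_state:
                return False
    return True


# Hypothetical four-node tree: root -> a -> b, and root -> c.
parent = {"root": None, "a": "root", "b": "a", "c": "root"}
valid = {"root": [0, 0], "a": [1, 0], "b": [1, 2], "c": [1, 0]}
invalid = {"root": [0, 0], "a": [1, 0], "b": [3, 2], "c": [1, 0]}

print(satisfies_star_homoplasy(parent, valid))
print(satisfies_star_homoplasy(parent, invalid))
```

In the first labeling, the same edit (state 1) arises independently on two branches, which is homoplasy and is permitted. The second labeling changes an already edited character from 1 to 3, which the check rejects.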
The ENCODE4 Consortium's efforts to annotate non-coding, cis-regulatory elements (CREs) have advanced our understanding of gene regulatory landscapes, which play a major role in health and disease. Pooled, non-coding CRISPR screens are a promising approach for systematically investigating gene regulatory mechanisms. Here, the ENCODE4 Functional Characterization Centers report 109 screens comprising 346,970 individual perturbations across 13.3 Mb of the genome, using a variety of methods, readouts, and statistical analyses. Across 332 functionally confirmed CRE-gene links, we identify principles for screening endogenous, non-coding elements for causal regulatory mechanisms. Nearly all CREs show strong evidence of open chromatin, and targeting accessibility peak summits is a critical component of our proposed sgRNA design rules. We provide experimental guidelines to accurately detect CREs with variable, often low, transcriptional effects. We discover a previously undescribed DNA strand-bias for CRISPRi in transcribed regions with implications for screen design and analysis. Benchmarking five screen analysis tools, we find CASA produces the most conservative CRE calls and is robust to artifacts of low-specificity sgRNAs. Together, we provide an accessible data resource, predesigned sgRNAs targeting 3,275,697 ENCODE SCREEN cCREs, and screening guidelines to accelerate functional characterization of the non-coding genome.
Motivation: New low-coverage single-cell DNA sequencing technologies enable the measurement of copy number profiles from thousands of individual cells within tumors. From these data, one can infer the evolutionary history of the tumor by modeling transformations of the genome via copy number aberrations. A widely used model to infer such copy number phylogenies is the copy number transformation (CNT) model, in which a genome is represented by an integer vector and a copy number aberration is an event that either increases or decreases the number of copies of a contiguous segment of the genome. The CNT distance between a pair of copy number profiles is the minimum number of events required to transform one profile into the other. While this distance can be computed efficiently, no efficient algorithm has been developed to find the most parsimonious phylogeny under the CNT model. Results: We introduce the zero-agnostic copy number transformation (ZCNT) model, a simplification of the CNT model that allows the amplification or deletion of regions with zero copies. We derive a closed-form expression for the ZCNT distance between two copy number profiles and show that, unlike the CNT distance, the ZCNT distance forms a metric. We leverage the closed-form expression for the ZCNT distance and an alternative characterization of copy number profiles to derive polynomial-time algorithms for two natural relaxations of the small parsimony problem on copy number profiles. While the alteration of zero copy number regions allowed under the ZCNT model is not biologically realistic, we show on both simulated and real datasets that the ZCNT distance is a close approximation to the CNT distance. Extending our polynomial-time algorithm for the ZCNT small parsimony problem, we develop an algorithm, Lazac, for solving the large parsimony problem on copy number profiles. We demonstrate that Lazac outperforms existing methods for inferring copy number phylogenies on both simulated and real data.
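To illustrate why dropping the zero restriction can yield a closed form, here is a minimal sketch under stated assumptions: a single chromosome, a single allele, and events that add or subtract one copy on any contiguous segment regardless of zeros. Under those assumptions, each event changes exactly two entries of the zero-padded adjacent-difference vector, so the minimum number of events is half the sum of absolute adjacent differences. This is a standard difference argument and may not match the exact expression in the paper; the function name `zcnt_distance_sketch` is hypothetical.

```python
from typing import Sequence


def zcnt_distance_sketch(u: Sequence[int], v: Sequence[int]) -> int:
    """Minimum number of +/-1 interval events turning u into v (assumed model).

    Assumes one chromosome and one allele, with zero entries treated
    like any other entry.
    """
    if len(u) != len(v):
        raise ValueError("profiles must have the same number of segments")
    # Work on the difference profile, padded with a zero at each end.
    padded = [0] + [a - b for a, b in zip(u, v)] + [0]
    # Each event shifts one breakpoint up by 1 and another down by 1,
    # so the total absolute breakpoint change is always even.
    total = sum(abs(padded[i] - padded[i - 1]) for i in range(1, len(padded)))
    return total // 2


print(zcnt_distance_sketch([1, 1, 1, 1], [2, 2, 0, 1]))
print(zcnt_distance_sketch([0, 1], [1, 1]))
```

The first call needs two events (amplify the first two segments, delete the third). The second call amplifies a zero-copy segment, which this relaxed model permits with one event. Because the value depends only on the absolute difference of the two profiles, the sketch is symmetric in its arguments.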