Summary
Following Cas9 cleavage, DNA repair without a donor template is generally considered stochastic, heterogeneous, and impractical beyond gene disruption. Here, we show that template-free Cas9 editing is predictable and capable of precise repair to a predicted genotype, enabling correction of human disease-associated mutations. We constructed a library of 2,000 Cas9 guide RNAs (gRNAs) paired with DNA target sites and trained inDelphi, a machine learning model that predicts genotypes and frequencies of 1- to 60-bp deletions and 1-bp insertions with high accuracy (r = 0.87) in five human and mouse cell lines. inDelphi predicts that 5–11% of Cas9 gRNAs targeting the human genome are “precise-50”, yielding a single genotype comprising ≥50% of all major editing products. We experimentally confirmed precise-50 insertions and deletions in 195 human disease-relevant alleles, including correction in primary patient-derived fibroblasts of pathogenic alleles to wild-type genotype for Hermansky-Pudlak syndrome and Menkes disease. This study establishes an approach for precise, template-free genome editing.
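As a minimal sketch of the precise-50 criterion described above: given the predicted frequencies of editing outcomes for one gRNA, check whether the most frequent genotype accounts for at least 50% of the total. The function names and the example counts are hypothetical and chosen only for illustration; this is not the inDelphi implementation, and it assumes the input already contains only the major editing products.

```python
from typing import Mapping, Tuple


def top_outcome(outcome_freqs: Mapping[str, float]) -> Tuple[str, float]:
    """Return the most frequent genotype and its share of all listed outcomes.

    `outcome_freqs` maps a genotype label to a predicted frequency or read
    count (hypothetical input format, assumed for this sketch).
    """
    total = sum(outcome_freqs.values())
    if total <= 0:
        raise ValueError("outcome frequencies must sum to a positive value")
    genotype = max(outcome_freqs, key=outcome_freqs.get)
    return genotype, outcome_freqs[genotype] / total


def is_precise(outcome_freqs: Mapping[str, float], threshold: float = 0.5) -> bool:
    """True if a single genotype makes up at least `threshold` of the outcomes."""
    _, share = top_outcome(outcome_freqs)
    return share >= threshold


# Hypothetical predicted outcome counts for two gRNAs (illustrative only).
precise_example = {"1-bp insertion (A)": 62, "2-bp deletion": 20, "7-bp deletion": 18}
diffuse_example = {"1-bp insertion (A)": 30, "2-bp deletion": 35, "7-bp deletion": 35}

if __name__ == "__main__":
    print(top_outcome(precise_example), is_precise(precise_example))
    print(top_outcome(diffuse_example), is_precise(diffuse_example))
```

Normalizing by the sum of the listed outcomes means the function accepts either raw counts or frequencies, as long as every entry uses the same unit.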
Highlights
• Base editing outcome precision and efficiency are frequently unintuitive
• Machine learning model (BE-Hive) accurately predicts base editing efficiency and editing patterns
• Base editor engineering can increase and reduce aberrant transversion editing
• We precisely correct 3,388 pathogenic SNVs, many previously considered intractable
The targeting scope of Streptococcus pyogenes Cas9 (SpCas9) and its engineered variants is largely restricted to protospacer-adjacent motif (PAM) sequences containing Gs. Here, we report the evolution of three new SpCas9 variants that collectively recognize NRNH PAMs (where R = A or G and H = A, C, or T) using phage-assisted non-continuous evolution (PANCE), three new phage-assisted continuous evolution (PACE) strategies for DNA binding, and a secondary selection for DNA cleavage. The targeting capabilities of these evolved variants and SpCas9-NG were characterized in HEK293T cells using a library of 11,776 genomically integrated protospacer-sgRNA pairs containing all possible NNNN PAMs. The evolved variants mediate indel formation and base editing in human cells and enable the A•T-to-G•C base editing of a sickle-cell anemia mutation using a previously inaccessible CACC PAM. These new evolved SpCas9s, together with previously reported variants, in principle enable targeting the majority of NR PAM sequences and substantially reduce the fraction of genomic sites that are inaccessible by Cas9-based methods.
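The NRNH notation above uses IUPAC degenerate nucleotide codes. The following sketch, which assumes nothing about the authors' analysis pipeline, enumerates all 4-nt NNNN PAMs and selects those matching NRNH; the helper name `matches_pam` and the lookup table are hypothetical and include only the codes needed here.

```python
from itertools import product

# Subset of the IUPAC nucleotide codes, limited to those used in the text:
# R = A or G, H = A, C, or T, N = any base.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",
    "H": "ACT",
    "N": "ACGT",
}


def matches_pam(seq: str, pattern: str) -> bool:
    """True if the concrete sequence `seq` fits the degenerate PAM `pattern`."""
    if len(seq) != len(pattern):
        return False
    return all(
        base in IUPAC[code]
        for base, code in zip(seq.upper(), pattern.upper())
    )


# All possible 4-nt PAMs (NNNN), then the subset covered by NRNH.
all_pams = ["".join(bases) for bases in product("ACGT", repeat=4)]
nrnh_pams = [pam for pam in all_pams if matches_pam(pam, "NRNH")]

if __name__ == "__main__":
    print(len(all_pams), len(nrnh_pams))
    print(matches_pam("CACC", "NRNH"))
```

Under these definitions, NRNH covers 4 × 2 × 4 × 3 = 96 of the 256 possible 4-nt PAMs, and the CACC PAM mentioned above falls within that set.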