The CRISPR/Cas system is a highly specific genome editing tool capable of distinguishing alleles differing by even a single base pair. However, current tools only design sgRNAs for a reference genome, not taking into account individual variants which may generate, remove, or modify CRISPR/Cas sgRNA sites. This may cause mismatches between designed sgRNAs and the individual genome they are intended to target, leading to decreased experimental performance. Here we describe AlleleAnalyzer, a tool for designing personalized and allele-specific sgRNAs for genome editing. We leverage >2,500 human genomes to identify optimized pairs of sgRNAs that can be used for human therapeutic editing in large populations in the future.

Keywords

CRISPR, sgRNA design, genomics, genome surgery, genome editing, computational biology

Background

The CRISPR/Cas genome-editing system is highly specific, with the ability to discriminate between similar genomic sites, even alleles, based on a single nucleotide difference [1]. In order to target a genomic region with the CRISPR system, a single-guide RNA (sgRNA) must be designed that is specific to the region of interest. While current sgRNA design tools incorporate various data relating to predicted efficiency and specificity, such as epigenetic marks and chromatin accessibility [2, 3, 4], in the vast majority of cases sgRNAs are designed using reference genomes, such as the hg38 assembly for human or the GRCm38 assembly for mouse. However, sgRNAs are often used on cell lines or organisms with many nucleotide differences from the reference (e.g., on average 0.1% of a human genome [5]). Although sgRNAs can sometimes tolerate a single base-pair mismatch, such mismatches frequently reduce sgRNA efficiency and render specificity predictions imprecise [2, 6, 7].
Furthermore, the use of CRISPR to research areas such as haploinsufficiency, genomic imprinting, and dominant negative diseases requires allele-specific sgRNA design. To address these challenges, we developed AlleleAnalyzer, a software tool that designs personalized and allele-specific sgRNAs for individual genomes, identifies pairs of sgRNAs to generate excisions likely to block expression of a gene, and leverages patterns of shared variation from >2,500 human genomes to design sgRNA pairs that will have the greatest utility in a target population.

Results and Discussion

Incorporating genetic variation into sgRNA design enables personalized and allele-specific CRISPR

All possible personalized sgRNAs for SpCas9, SaCas9 and Cpf1 (Cas12a) in the region surrounding the first exon of RHO in WTC (Supplementary Figure 6). WTC has no homozygous variants in this region, thus the allele frequency and variant-related columns are blank. However, the sgRNAs are designed to avoid the 9 heterozygous variants that WTC has in this region.
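
To illustrate what avoiding heterozygous variants involves, the following is a minimal Python sketch and not AlleleAnalyzer's actual implementation. It assumes SpCas9 with a 20-nt protospacer followed by an NGG PAM, scans only the forward strand, and uses 0-based positions. The function names `find_spcas9_guides` and `split_by_het_variants` are hypothetical and chosen for this example only.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Guide(NamedTuple):
    start: int        # 0-based index of the first protospacer base
    protospacer: str  # 20-nt target sequence
    pam: str          # 3-nt PAM immediately 3' of the protospacer


def find_spcas9_guides(seq: str, guide_len: int = 20) -> List[Guide]:
    """Return forward-strand sites of the form protospacer + NGG (assumed SpCas9 PAM)."""
    seq = seq.upper()
    site_len = guide_len + 3
    guides = []
    for start in range(len(seq) - site_len + 1):
        site = seq[start:start + site_len]
        pam = site[guide_len:]
        # Require GG at PAM positions 2-3 and skip sites with ambiguous bases.
        if pam[1:] == "GG" and set(site) <= set("ACGT"):
            guides.append(Guide(start, site[:guide_len], pam))
    return guides


def split_by_het_variants(
    guides: Iterable[Guide], het_positions: Iterable[int]
) -> Tuple[List[Guide], List[Guide]]:
    """Separate guides that avoid all heterozygous variants from those that overlap one.

    Guides overlapping a heterozygous position match only one allele, so they are
    allele-specific candidates; the rest target both alleles of this individual.
    The PAM's N position is conservatively counted as part of the site.
    """
    het = set(het_positions)
    shared, allele_specific = [], []
    for g in guides:
        site = range(g.start, g.start + len(g.protospacer) + len(g.pam))
        if any(p in site for p in het):
            allele_specific.append(g)
        else:
            shared.append(g)
    return shared, allele_specific
```

Under these assumptions, passing the positions of an individual's heterozygous variants to `split_by_het_variants` yields two lists: guides usable on both alleles of that genome, and guides that could in principle discriminate between alleles. A fuller treatment would also scan the reverse strand and consider variants that create a PAM absent from the reference.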
Supplementary Table 4