Alternative end-joining (alt-EJ) repair of DNA double-strand breaks is associated with deletions, chromosome translocations, and genome instability. Alt-EJ frequently uses annealing of microhomologous sequences to tether broken ends. When accessible pre-existing microhomologies do not exist, we have postulated that new microhomologies can be created via limited DNA synthesis at secondary-structure-forming sequences. This model, called synthesis-dependent microhomology-mediated end joining (SD-MMEJ), predicts that differences between DNA sequences near double-strand breaks should alter repair outcomes in predictable ways. To test this hypothesis, we injected plasmids with sequence variations flanking an I-SceI endonuclease recognition site into I-SceI-expressing Drosophila embryos and used Illumina amplicon sequencing to compare repair junctions. As predicted by the model, we found that small changes in sequences near the I-SceI site had major impacts on the spectrum of repair junctions. Bioinformatic analyses suggest that these repair differences arise from transiently forming loops and hairpins within 30 nucleotides of the break. We also obtained evidence for ‘trans SD-MMEJ,’ involving at least two consecutive rounds of microhomology annealing and synthesis across the break site. These results highlight the importance of sequence context for alt-EJ repair and have implications for genome editing and genome evolution.
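The idea of "pre-existing microhomologies" near a break can be illustrated with a short sketch. This is not the authors' analysis; it is a simplified illustration that treats the break as blunt (ignoring the overhangs I-SceI leaves), looks only at the top strand, and reports every maximal stretch shared by the left and right flanks. The function name `find_microhomologies` and the parameters `min_len` and `window` are hypothetical; the default window of 30 simply echoes the 30-nucleotide range mentioned above.

```python
def find_microhomologies(left, right, min_len=2, window=30):
    """List maximal sequences shared by the two flanks of a break.

    left, right: top-strand sequence to the left and right of the break.
    Returns tuples (microhomology, left_start, right_start, deleted_bases),
    where the start positions are 0-based indices into `left` and `right`,
    and deleted_bases is the deletion size if the two copies anneal and
    a single copy is retained in the repaired junction.
    """
    left_w = left[-window:]            # bases closest to the break, left side
    right_w = right[:window]           # bases closest to the break, right side
    offset = len(left) - len(left_w)   # converts window index to index in `left`
    hits = []
    for i in range(len(left_w)):
        for j in range(len(right_w)):
            # Skip positions that are the interior of a longer match.
            if i > 0 and j > 0 and left_w[i - 1] == right_w[j - 1]:
                continue
            k = 0
            while (i + k < len(left_w) and j + k < len(right_w)
                   and left_w[i + k] == right_w[j + k]):
                k += 1
            if k >= min_len:
                # Junction keeps left up to the end of the match and
                # right after the end of the match.
                deleted = (len(left_w) - i) + j
                hits.append((left_w[i:i + k], offset + i, j, deleted))
    return hits


if __name__ == "__main__":
    # Toy sequences chosen for illustration only, not experimental data.
    for hit in find_microhomologies("ACCGTA", "TGCGTC"):
        print(hit)
```

In this toy case the only shared stretch of two or more bases is `CGT`, and annealing at it would delete six bases. When a search like this returns nothing close to the break, the SD-MMEJ model proposes that new microhomology is synthesized instead.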
Alternative end joining (alt-EJ) mechanisms, such as polymerase theta-mediated end joining, are increasingly recognized as important contributors to inaccurate double-strand break repair. We previously proposed an alt-EJ model whereby short DNA repeats near a double-strand break anneal to form secondary structures that prime limited DNA synthesis. The nascent DNA then pairs with microhomologous sequences on the other break end. This synthesis-dependent microhomology-mediated end joining (SD-MMEJ) explains many of the alt-EJ repair products recovered following I-SceI nuclease cutting in Drosophila. However, sequence-specific factors that influence SD-MMEJ repair remain to be fully characterized. Here, we expand the utility of the SD-MMEJ model through computational analysis of repair products at Cas9-induced double-strand breaks for 1100 different sequence contexts. We find evidence at single nucleotide resolution for sequence characteristics that drive successful SD-MMEJ repair. These include optimal primer repeat length, distance of repeats from the break, flexibility of DNA sequence between primer repeats, and positioning of microhomology templates relative to preferred primer repeats. In addition, we show that DNA polymerase theta is necessary for most SD-MMEJ repair at Cas9 breaks. The analysis described here includes a computational pipeline that can be utilized to characterize preferred mechanisms of alt-EJ repair in any sequence context.
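One class of secondary structure that could prime synthesis is a snap-back hairpin, which requires an inverted repeat on a single strand. The sketch below is a minimal illustration of searching for such repeats, not the pipeline described in the abstract. It assumes a fixed stem length and a minimum loop size, and the names `find_inverted_repeats`, `stem_len`, and `min_loop` are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_inverted_repeats(seq, stem_len=4, min_loop=3):
    """Find pairs of arms in `seq` that could pair to form a hairpin stem.

    Returns tuples (arm_start, partner_start, arm, partner), where
    `partner` is the reverse complement of `arm` and begins at least
    `min_loop` bases after the arm ends.
    """
    hits = []
    for i in range(len(seq) - stem_len + 1):
        arm = seq[i:i + stem_len]
        partner = revcomp(arm)
        # Search downstream, leaving room for the loop.
        j = seq.find(partner, i + stem_len + min_loop)
        while j != -1:
            hits.append((i, j, arm, partner))
            j = seq.find(partner, j + 1)
    return hits


if __name__ == "__main__":
    # Toy sequence for illustration only: GGAC + TTT loop + GTCC.
    print(find_inverted_repeats("GGACTTTGTCC"))
```

To restrict the search to sequence near a break, a caller could pass only the slice of the flank within a chosen distance of the break end. Features such as repeat length, distance from the break, and spacer composition, which the abstract identifies as relevant, are not scored in this sketch.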
There are many applications in which quantitative information about DNA mixtures with different molecular lengths is important. Gene therapy vectors are much longer than can be sequenced individually via short-read NGS. However, vector preparations may contain smaller DNAs that behave differently during sequencing. We have used two library preparations each for Pacific Biosciences (PacBio) and Oxford Nanopore Technologies NGS to determine their suitability for quantitative assessment of DNAs of varying sizes. Equimolar length standards were generated from E. coli genomic DNA. Both PacBio library preparations provided a consistent length dependence, though with a complex pattern. This method is sufficiently sensitive that differences in genomic copy number between DNA from E. coli grown in exponential and stationary phase conditions could be detected. The transposase-based Oxford Nanopore library preparation provided a predictable length dependence, but the random sequence starts caused the loss of original length information. The ligation-based approach retained length information, but read frequency was more variable. Modeling of E. coli versus lambda read frequency via cubic spline smoothing showed that the shorter genome could be used as a suitable internal spike-in for DNAs in the 200 bp to 10 kb range, allowing meaningful QC to be carried out with AAV preparations.