Expansions of short tandem repeats are genetic variants that have been implicated in neuropsychiatric and other disorders, but their assessment remains challenging with current molecular methods. Here, we developed a Cas12a-based enrichment strategy for nanopore sequencing that, combined with a new algorithm for raw signal analysis, enables us to efficiently target, sequence and precisely quantify repeat numbers as well as their DNA methylation status. Taking advantage of these single-molecule nanopore signals therefore provides unprecedented opportunities to study pathological repeat expansions.

The expansion of unstable genomic Short Tandem Repeats (STRs) causes more than 30 Mendelian human disorders [1]. For example, expansion of a GGGGCC-repeat [(G4C2)n] within the C9orf72 gene is the most frequent monogenic cause of Frontotemporal Dementia (FTD) and Amyotrophic Lateral Sclerosis (ALS; c9FTD/ALS; OMIM: # 105550) [2,3]. Similarly, accumulation of a CGG motif in the FMR1 gene underlies the Fragile X Syndrome (FXS; OMIM # 300624), currently the most common identifiable genetic cause of mental retardation and autism [4]. In both prototypical repeat expansion disorders (Suppl. Discussion 1), recent evidence has suggested that pronounced inter- and intraindividual repeat variability, as well as changes in DNA methylation of the respective genomic regions, modulate disease phenotype [5-8].

To overcome current difficulties in characterizing expanded STRs (Suppl. Discussion 2), we focused most notably on three areas: i) optimization of nanopore sequencing and signal processing to capture STRs, ii) development and implementation of a target enrichment strategy to increase efficiency, and iii) integration of expansion measurements with DNA methylation of the same molecule.

Figure 1 nanoSTRique: Generic repeat detection pipeline on raw nanopore signals.
a) Repeat quantification by signal-alignment of flanking prefix and suffix regions and HMM based count on signal of interest. b) BioAnalyzer electropherogram, decoy alignment, RepeatHMM and nanoSTRique counts of synthetic (G4C2)n repeats (10k random reads per barcode, +/-10 % intervals around expected repeat length). c) Nanopore sequencing and analysis of BAC clone 239 from a c9ALS/FTD patient compared to cropped corresponding lane from Ref. 15 for illustration purpose. d) Manual confirmation of detected repeat counts in synthetic repeats (n=16, 50, 49, 49, 47).

First, for benchmarking repeat expansion counting methods we constructed, verified and nanopore sequenced plasmids with several synthetic (G4C2)n-repeat lengths [9]. We analyzed our results with currently available STR quantification pipelines [10,11] but found those methods to become unreliable for more than 32 (G4C2)n-repeats with nanopore reads. To further improve the repeat analysis we developed a signal processing algorithm for a more exact quantification of STR numbers in raw na...
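
The flank-anchored counting idea of Figure 1a and the +/-10 % interval of Figure 1b can be illustrated with a deliberately simplified sketch. It works on base-called sequence with exact string matching, not on raw nanopore signal, and it is not the nanoSTRique signal-alignment and HMM procedure. The flank sequences, function names and example counts are hypothetical and chosen only for illustration.

```python
import re


def count_repeats(read, prefix, suffix, motif):
    """Count consecutive motif copies directly after `prefix` and before `suffix`.

    Returns None when either flank cannot be located in the read.
    Exact matching only: real nanopore reads would need error-tolerant
    alignment, which this toy example does not attempt.
    """
    start = read.find(prefix)
    if start == -1:
        return None
    start += len(prefix)
    end = read.find(suffix, start)
    if end == -1:
        return None
    region = read[start:end]
    # Longest run of back-to-back motif copies at the start of the region.
    run = re.match(f"(?:{re.escape(motif)})*", region).group(0)
    return len(run) // len(motif)


def within_tolerance(count, expected, tolerance=0.10):
    """True if `count` lies within +/- tolerance (a fraction) of `expected`."""
    return abs(count - expected) <= tolerance * expected


def fraction_within(counts, expected, tolerance=0.10):
    """Fraction of successfully counted reads that fall inside the interval."""
    valid = [c for c in counts if c is not None]
    if not valid:
        return 0.0
    hits = sum(within_tolerance(c, expected, tolerance) for c in valid)
    return hits / len(valid)


if __name__ == "__main__":
    # Hypothetical flanks; not the real C9orf72 flanking sequences.
    prefix, suffix, motif = "ACGTAC", "TTAGCA", "GGGGCC"
    read = "AA" + prefix + motif * 5 + suffix + "AA"
    print(count_repeats(read, prefix, suffix, motif))
    print(fraction_within([30, 32, 28, None], expected=32))
```

In this toy setting a read counts as concordant when its repeat number falls within 10 % of the expected construct length; reads whose flanks cannot be found are excluded from the denominator, which is an assumption of the sketch rather than something stated in the text.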