Microsatellites are multi-allelic and composed of short tandem repeats (STRs) with individual motifs composed of mononucleotides, dinucleotides or higher including hexamers. Next-generation sequencing approaches and other STR assays rely on a limited number of PCR amplicons, typically in the tens. Here, we demonstrate STR-Seq, a next-generation sequencing technology that analyses over 2,000 STRs in parallel, and provides the accurate genotyping of microsatellites. STR-Seq employs in vitro CRISPR–Cas9-targeted fragmentation to produce specific DNA molecules covering the complete microsatellite sequence. Amplification-free library preparation provides single molecule sequences without unique molecular barcodes. STR-selective primers enable massively parallel, targeted sequencing of large STR sets. Overall, STR-Seq has higher throughput, improved accuracy and provides a greater number of informative haplotypes compared with other microsatellite analysis approaches. With these new features, STR-Seq can identify a 0.1% minor genome fraction in a DNA mixture composed of different, unrelated samples.
The human genome is composed of two haplotypes, otherwise called diplotypes, which denote phased polymorphisms and structural variations (SVs) that are derived from both parents. Diplotypes place genetic variants in the context of cis-related variants from a diploid genome. As a result, they provide valuable information about hereditary transmission, context of SV, regulation of gene expression and other features which are informative for understanding human genetics. Successful diplotyping with short read whole genome sequencing generally requires either a large population or parent-child trio samples. To overcome these limitations, we developed a targeted sequencing method for generating megabase (Mb)-scale haplotypes with short reads. One selects specific 0.1–0.2 Mb high molecular weight DNA targets with custom-designed Cas9–guide RNA complexes followed by sequencing with barcoded linked reads. To test this approach, we designed three assays, targeting the BRCA1 gene, the entire 4-Mb major histocompatibility complex locus and 18 well-characterized SVs, respectively. Using an integrated alignment- and assembly-based approach, we generated comprehensive variant diplotypes spanning the entirety of the targeted loci and characterized SVs with exact breakpoints. Our results were comparable in quality to long read sequencing.
Background The genome of SARS-CoV-2 is susceptible to mutations during viral replication due to the errors generated by RNA-dependent RNA polymerases. These mutations enable the SARS-CoV-2 to evolve into new strains. Viral quasispecies emerge from de novo mutations that occur in individual patients. In combination, these sets of viral mutations provide distinct genetic fingerprints that reveal the patterns of transmission and have utility in contact tracing. Methods Leveraging thousands of sequenced SARS-CoV-2 genomes, we performed a viral pangenome analysis to identify conserved genomic sequences. We used a rapid and highly efficient computational approach that relies on k-mers, short tracts of sequence, instead of conventional sequence alignment. Using this method, we annotated viral mutation signatures that were associated with specific strains. Based on these highly conserved viral sequences, we developed a rapid and highly scalable targeted sequencing assay to identify mutations, detect quasispecies variants, and identify mutation signatures from patients. These results were compared to the pangenome genetic fingerprints. Results We built a k-mer index for thousands of SARS-CoV-2 genomes and identified conserved genomics regions and landscape of mutations across thousands of virus genomes. We delineated mutation profiles spanning common genetic fingerprints (the combination of mutations in a viral assembly) and a combination of mutations that appear in only a small number of patients. We developed a targeted sequencing assay by selecting primers from the conserved viral genome regions to flank frequent mutations. Using a cohort of 100 SARS-CoV-2 clinical samples, we identified genetic fingerprints consisting of strain-specific mutations seen across populations and de novo quasispecies mutations localized to individual infections. We compared the mutation profiles of viral samples undergoing analysis with the features of the pangenome. Conclusions We conducted an analysis for viral mutation profiles that provide the basis of genetic fingerprints. Our study linked pangenome analysis with targeted deep sequenced SARS-CoV-2 clinical samples. We identified quasispecies mutations occurring within individual patients and determined their general prevalence when compared to over 70,000 other strains. Analysis of these genetic fingerprints may provide a way of conducting molecular contact tracing.
Small RNAs that originate from transfer RNA (tRNA) species, tRNA-derived fragments (tRFs), play diverse biological functions but little is known for their association with aging. Moreover, biochemical aspects of tRNAs limit discovery of functional tRFs by high throughput sequencing. In particular, genes encoding tRNAs exist as multiple copies throughout genome, and mature tRNAs have various modified bases, contributing to ambiguities for RNA sequencing-based analysis of tRFs. Here, we report age-dependent changes of tRFs in Caenorhabditis elegans. We first analyzed published RNA sequencing data by using a new strategy for tRNA-associated sequencing reads. Our current method used unique mature tRNAs as a reference for the sequence alignment, and properly filtered out false positive enrichment for tRFs. Our analysis successfully distinguished de novo mutation sites from differences among homologous copies, and identified potential RNA modification sites. Overall, the majority of tRFs were upregulated during aging and originated from 5′-ends, which we validated by using Northern blot analysis. Importantly, we revealed that the major source of tRFs upregulated during aging was the tRNAs with abundant gene copy numbers. Our analysis suggests that tRFs are useful biomarkers of aging particularly when they originate from abundant homologous gene copies.
We developed a sensitive sequencing approach that simultaneously profiles microsatellite instability, chromosomal instability, and subclonal structure in cancer. We assessed diverse repeat motifs across 225 microsatellites on colorectal carcinomas. Our study identified elevated alterations at both selected tetranucleotide and conventional mononucleotide repeats. Many colorectal carcinomas had a mix of genomic instability states that are normally considered exclusive. An MSH3 mutation may have contributed to the mixed states. Increased copy number of chromosome arm 8q was most prevalent among tumors with microsatellite instability, including a case of translocation involving 8q. Subclonal analysis identified co-occurring driver mutations previously known to be exclusive.
Background: The genome of SARS-CoV-2 is susceptible to mutations during viral replication due to the errors generated by RNA-dependent RNA polymerases. These mutations enable the SARS-CoV-2 to evolve into new strains. Viral quasispecies emerge from de novo mutations that occur in individual patients. In combination, these sets of viral mutations provide distinct genetic fingerprints that reveal the patterns of transmission and have utility in contract tracing. Methods: Leveraging thousands of sequenced SARS-CoV-2 genomes, we performed a viral pangenome analysis to identify conserved genomic sequences. We used a rapid and highly efficient computational approach that relies on k-mers, short tracts of sequence, instead of conventional sequence alignment. Using this method, we annotated viral mutation signatures that were associated with specific strains. Based on these highly conserved viral sequences, we developed a rapid and highly scalable targeted sequencing assay to identify mutations, detect quasispecies and identify mutation signatures from patients. These results were compared to the pangenome genetic fingerprints. Results: We built a k-mer index for thousands of SARS-CoV-2 genomes and identified conserved genomics regions and landscape of mutations across thousands of virus genomes. We delineated mutation profiles spanning common genetic fingerprints (the combination of mutations in a viral assembly) and rare ones that occur in only small fraction of patients. We developed a targeted sequencing assay by selecting primers from the conserved viral genome regions to flank frequent mutations. Using a cohort of SARS-CoV-2 clinical samples, we identified genetic fingerprints consisting of strain-specific mutations seen across populations and de novo quasispecies mutations localized to individual infections. We compared the mutation profiles of viral samples undergoing analysis with the features of the pangenome. Conclusions: We conducted an analysis for viral mutation profiles that provide the basis of genetic fingerprints. Our study linked pangenome analysis with targeted deep sequenced SARS-CoV-2 clinical samples. We identified quasispecies mutations occurring within individual patients, mutations demarcating dominant species and the prevalence of mutation signatures, of which a significant number were relatively unique. Analysis of these genetic fingerprints may provide a way of conducting molecular contact tracing.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.