Single molecule molecular inversion probes for targeted, high-accuracy detection of low-frequency variation

Hiatt, Joseph; Pritchard, Colin C.; Salipante, Stephen J.; O’Roak, Brian J.; Shendure, Jay

doi:10.1101/gr.147686.112

Cited by 309 publications

(388 citation statements)

References 55 publications

(84 reference statements)

Supporting

Mentioning

382

Contrasting

“…1A; Hiatt et al 2013). In Col-0, we successfully captured all 102 STR target loci (Supplemental Fig.…”

Section: Results (mentioning)

confidence: 99%

“…We designed MIPs to capture 200 bp, a larger target size than is typically used (Porreca et al 2007; Turner et al 2009; O'Roak et al 2012; Hiatt et al 2013), to increase the size range of targeted STRs. This size range encompasses the majority of STRs (Gymrek et al 2012), except those that have undergone extreme expansion as seen in some human diseases and for the intronic STR in the A. thaliana IIL1 gene (Gatchel and Zoghbi 2005; Sureshkumar et al 2009).…”

Section: Results (mentioning)

confidence: 99%

“…We used single-molecule Molecular Inversion Probes (smMIPs) (Hiatt et al 2013) to capture STRs, thereby maximizing the number of STR-spanning, informative reads. In a proof-of-principle experiment, we targeted 102 STRs across the genome of the model (Sullivan et al 2014), and intergenic tri- and hexa-nucleotide STRs (Supplemental Fig.…”

Section: Results (mentioning)

confidence: 99%

“…MIPSTR combines STR capture via single-molecule Molecular Inversion Probes (smMIPs) (Hiatt et al 2013) with midsize sequencing reads (250 bp) and a unique mapping strategy. In proof-of-principle experiments, we captured and sequenced STRs genome-wide in diverse A. thaliana populations, called germline STR genotypes with high accuracy, and quantified technical error with single-molecule information.…”

Section: [Supplemental Materials Is Available For This Article] (mentioning)

confidence: 99%

MIPSTR: a method for multiplex genotyping of germline and somatic STR variation across many individuals

et al. 2015

Self Cite

Short tandem repeats (STRs) are highly mutable genetic elements that often reside in regulatory and coding DNA. The cumulative evidence of genetic studies on individual STRs suggests that STR variation profoundly affects phenotype and contributes to trait heritability. Despite recent advances in sequencing technology, STR variation has remained largely inaccessible across many individuals compared to single nucleotide variation or copy number variation. STR genotyping with short-read sequence data is confounded by (1) the difficulty of uniquely mapping short, low-complexity reads; and (2) the high rate of STR amplification stutter. Here, we present MIPSTR, a robust, scalable, and affordable method that addresses these challenges. MIPSTR uses targeted capture of STR loci by single-molecule Molecular Inversion Probes (smMIPs) and a unique mapping strategy. Targeted capture and our mapping strategy resolve the first challenge; the use of single molecule information resolves the second challenge. Unlike previous methods, MIPSTR is capable of distinguishing technical error due to amplification stutter from somatic STR mutations. In proof-of-principle experiments, we use MIPSTR to determine germline STR genotypes for 102 STR loci with high accuracy across diverse populations of the plant A. thaliana. We show that putatively functional STRs may be identified by deviation from predicted STR variation and by association with quantitative phenotypes. Using DNA mixing experiments and a mutant deficient in DNA repair, we demonstrate that MIPSTR can detect low-frequency somatic STR variants. MIPSTR is applicable to any organism with a high-quality reference genome and is scalable to genotyping many thousands of STR loci in thousands of individuals.
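
The abstract's point that single-molecule information separates amplification stutter from real variation can be illustrated with a minimal sketch. This is not the MIPSTR pipeline: the function names, the read representation as (tag, repeat-unit count) pairs, and the `min_reads` and `min_fraction` thresholds are all assumptions made for illustration. Reads sharing a molecular tag derive from one captured molecule, so disagreement within a tag group is treated as technical error, while disagreement between tag consensus alleles is a candidate for genuine variation.

```python
from collections import Counter, defaultdict


def consensus_by_tag(reads, min_reads=2, min_fraction=0.6):
    """Collapse (tag, repeat_units) reads into one consensus allele per tag.

    Hypothetical thresholds: a tag needs at least `min_reads` reads, and the
    most common allele must make up at least `min_fraction` of them.
    """
    groups = defaultdict(list)
    for tag, repeat_units in reads:
        groups[tag].append(repeat_units)

    consensus = {}
    for tag, alleles in groups.items():
        if len(alleles) < min_reads:
            continue  # a single read cannot separate stutter from signal
        allele, count = Counter(alleles).most_common(1)[0]
        if count / len(alleles) >= min_fraction:
            consensus[tag] = allele
    return consensus


def allele_frequencies(consensus):
    """Frequency of each allele among tag-level consensus calls."""
    counts = Counter(consensus.values())
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


# Toy data: tag AAAC carries one stutter read (11 units instead of 12).
reads = [
    ("AAAC", 12), ("AAAC", 12), ("AAAC", 11),
    ("GGTA", 12), ("GGTA", 12),
    ("TTCG", 13), ("TTCG", 13), ("TTCG", 13),
    ("CCAT", 12),  # dropped: only one read for this tag
]
consensus = consensus_by_tag(reads)
frequencies = allele_frequencies(consensus)
```

In this toy input the 11-unit read is absorbed into its tag's 12-unit consensus, whereas the 13-unit allele is supported by every read of its own tag and therefore survives as a distinct molecule-level call.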

“…There are several protocols in which a sequence of random nucleotides is appended to the template molecules before amplification and sequencing. This idea has been applied under a variety of names to identify PCR duplicates (1, 2), improve counting of DNA (3, 4) and RNA (5-7) templates, and reduce sequence error (8-10). Each implementation has its own name for the random nucleotide sequences, and we refer to them as varietal tags (11).…”

mentioning

confidence: 99%

Facilitated sequence counting and assembly by template mutagenesis

Levy

Wigler

2014

Proc. Natl. Acad. Sci. U.S.A.

Presently, inferring the long-range structure of the DNA templates is limited by short read lengths. Accurate template counts suffer from distortions occurring during PCR amplification. We explore the utility of introducing random mutations in identical or nearly identical templates to create distinguishable patterns that are inherited during subsequent copying. We simulate the applications of this process under assumptions of error-free sequencing and perfect mapping, using cytosine deamination as a model for mutation. The simulations demonstrate that within readily achievable conditions of nucleotide conversion and sequence coverage, we can accurately count the number of otherwise identical molecules as well as connect variants separated by long spans of identical sequence. We discuss many potential applications, such as transcript profiling, isoform assembly, haplotype phasing, and de novo genome assembly.

Keywords: mutational tagging | expression profiling | copy number variation

Some problems in genomic analysis have remained difficult despite the development of high throughput sequencing methods. Many of these problems arise from the inability to distinguish identical and nearly identical template sequences. Counting molecules of identical composition in an RNA sequencing assay or the copy number of identical stretches of DNA currently depend on quantitative methods that adjust imperfectly for the distortions of data caused by sample processing. Moreover, because read lengths are short, determining the physical connection of distinguishable elements separated by long identical stretches is difficult to impossible and limits our ability to phase single nucleotide variants (SNVs), identify transcript isoforms, and assemble through repetitive genomic regions. We propose a method that solves these problems by randomly mutagenizing the original template molecules. Each template thus bears a unique signature that is imprinted on all of its subsequent copies and the fragments of those copies. Counting molecules becomes a matter of counting unique mutational patterns and assembly a matter of connecting reads with overlapping mutation patterns.

Modifying molecules to facilitate counting is not a new idea. There are several protocols in which a sequence of random nucleotides is appended to the template molecules before amplification and sequencing. This idea has been applied under a variety of names to identify PCR duplicates (1, 2), improve counting of DNA (3, 4) and RNA (5-7) templates, and reduce sequence error (8-10). Each implementation has its own name for the random nucleotide sequences, and we refer to them as varietal tags (11). Counting varietal tags serves the same role as counting unique mutational signatures, mitigating the effects of amplification bias. The advantage of tagging over mutation is that the original message is completely recoverable. The disadvantage is that the tag is confined to one end of the molecule such that identity and count can only be distinguished within one read length of the ends. Furthe...
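
The tag-counting idea described above can be sketched in a few lines. This is a simplified illustration, not code from the cited paper: the function name `count_templates`, the locus labels, and the (locus, tag) read representation are hypothetical, and the sketch assumes error-free tags with no collisions between distinct molecules.

```python
from collections import defaultdict


def count_templates(reads):
    """Return {locus: (read_count, distinct_tag_count)} for (locus, tag) reads.

    Assumes tags are read without error and that no two original molecules
    at the same locus happened to receive the same tag.
    """
    tags_per_locus = defaultdict(set)
    reads_per_locus = defaultdict(int)
    for locus, tag in reads:
        tags_per_locus[locus].add(tag)
        reads_per_locus[locus] += 1
    return {
        locus: (reads_per_locus[locus], len(tags))
        for locus, tags in tags_per_locus.items()
    }


# Toy data: locus_A was amplified unevenly (6 reads from 2 molecules),
# locus_B evenly (3 reads from 3 molecules).
reads = [
    ("locus_A", "ACGT"), ("locus_A", "ACGT"), ("locus_A", "ACGT"),
    ("locus_A", "ACGT"), ("locus_A", "ACGT"), ("locus_A", "TTAG"),
    ("locus_B", "GGCA"), ("locus_B", "CATG"), ("locus_B", "TGCA"),
]
template_counts = count_templates(reads)
```

Raw read counts would suggest locus_A is twice as abundant as locus_B, while the distinct-tag counts (2 versus 3) point the other way, which is the amplification-bias correction the passage refers to.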

Contribution of ultrarare variants in mTOR pathway genes to sporadic focal epilepsies

Pippucci

Bisulli

Baldassari

et al. 2019

Ann Clin Transl Neurol

Objective: We investigated the contribution to sporadic focal epilepsies (FE) of ultrarare variants in genes coding for the components of complexes regulating mechanistic Target Of Rapamycin (mTOR) complex 1 (mTORC1). Methods: We collected genetic data of 121 Italian isolated FE cases and 512 controls by Whole Exome Sequencing (WES) and single‐molecule Molecular Inversion Probes (smMIPs) targeting 10 genes of the GATOR1, GATOR2, and TSC complexes. We collapsed “qualifying” variants (ultrarare and predicted to be deleterious or loss of function) across the examined genes and sought to identify their enrichment in cases compared to controls. Results: We found eight qualifying variants in cases and nine in controls, demonstrating enrichment in FE patients (P = 0.006; exact unconditional test, one‐tailed). Pathogenic variants were identified in DEPDC5 and TSC2, both major genes for Mendelian FE syndromes. Interpretation: Our findings support the contribution of ultrarare variants in genes in the mTOR pathway complexes GATOR and TSC to the risk of sporadic FE and a shared genetic basis between rare and common epilepsies. The identification of a monogenic etiology in isolated cases, most typically encountered in clinical practice, may offer to a broader community of patients the perspective of precision therapies directed by the underlying genetic cause.
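
The collapsed case-control comparison in this abstract can be illustrated with a short enrichment calculation. Two caveats apply. First, the abstract reports an exact unconditional test, whereas the sketch below uses the one-sided Fisher exact test (a conditional test) because it needs only the standard library, so the P value it produces should not be expected to reproduce the published 0.006. Second, it assumes each qualifying variant corresponds to exactly one distinct carrier, which the abstract does not state. The function name `fisher_one_sided` is hypothetical.

```python
from math import comb


def fisher_one_sided(case_hits, case_total, control_hits, control_total):
    """One-sided Fisher exact P value for enrichment of hits in cases.

    Sums hypergeometric probabilities of seeing `case_hits` or more carriers
    among the cases, with all table margins held fixed.
    """
    total = case_total + control_total
    hits = case_hits + control_hits
    denominator = comb(total, case_total)
    upper = min(hits, case_total)
    numerator = sum(
        comb(hits, k) * comb(total - hits, case_total - k)
        for k in range(case_hits, upper + 1)
    )
    return numerator / denominator


# Counts from the abstract, under the one-variant-per-carrier assumption.
p_value = fisher_one_sided(case_hits=8, case_total=121,
                           control_hits=9, control_total=512)
```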
