Irina Shagina scite author profile

Decoding the fine-scale structure of a breast cancer genome and transcriptome

A comprehensive understanding of cancer is predicated upon knowledge of the structure of malignant genomes underlying its many variant forms and the molecular mechanisms giving rise to them. It is well established that solid tumor genomes accumulate a large number of genome rearrangements during tumorigenesis. End Sequence Profiling (ESP) maps and clones genome breakpoints associated with all types of genome rearrangements elucidating the structural organization of tumor genomes. Here we extend the ESP methodology in several directions using the breast cancer cell line MCF-7. First, targeted ESP is applied to multiple amplified loci, revealing a complex process of rearrangement and coamplification in these regions reminiscent of breakage/fusion/bridge cycles. Second, genome breakpoints identified by ESP are confirmed using a combination of DNA sequencing and PCR. Third, in vitro functional studies assign biological function to a rearranged tumor BAC clone, demonstrating that it encodes antiapoptotic activity. Finally, ESP is extended to the transcriptome identifying four novel fusion transcripts and providing evidence that expression of fusion genes may be common in tumors. These results demonstrate the distinct advantages of ESP including: (1) the ability to detect all types of rearrangements and copy number changes; (2) straightforward integration of ESP data with the annotated genome sequence; (3) immortalization of the genome; (4) ability to generate tumor-specific reagents for in vitro and in vivo functional studies. Given these properties, ESP could play an important role in a tumor genome project.

MAGERI: Computational pipeline for molecular-barcoded targeted resequencing

Shugay, Zaretsky, Shagin, et al. 2017. PLoS Comput Biol

Unique molecular identifiers (UMIs) show outstanding performance in targeted high-throughput resequencing, being the most promising approach for the accurate identification of rare variants in complex DNA samples. This approach has application in multiple areas, including cancer diagnostics, thus demanding dedicated software and algorithms. Here we introduce MAGERI, a computational pipeline that efficiently handles all caveats of UMI-based analysis to obtain high-fidelity mutation profiles and call ultra-rare variants. Using an extensive set of benchmark datasets including gold-standard biological samples with known variant frequencies, cell-free DNA from tumor patient blood samples and publicly available UMI-encoded datasets we demonstrate that our method is both robust and efficient in calling rare variants. The versatility of our software is supported by accurate results obtained for both tumor DNA and viral RNA samples in datasets prepared using three different UMI-based protocols.
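The core idea behind UMI-based error correction can be illustrated with a minimal sketch. It is not MAGERI's actual algorithm, only a generic majority-vote consensus written for illustration: reads sharing a UMI are assumed to derive from one original molecule, so a base seen in only a minority of them is treated as a PCR or sequencing error. The function name `umi_consensus`, the `min_reads` threshold, and the toy reads are all hypothetical.

```python
from collections import Counter, defaultdict


def umi_consensus(reads, min_reads=3):
    """Group (umi, sequence) pairs by UMI and build a majority-vote consensus.

    Hypothetical helper for illustration; UMI groups with fewer than
    `min_reads` reads are dropped because they cannot be error-corrected.
    """
    groups = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)

    consensus = {}
    for umi, seqs in groups.items():
        if len(seqs) < min_reads:
            continue
        # Compare only positions covered by every read in the group.
        length = min(len(s) for s in seqs)
        consensus[umi] = "".join(
            Counter(s[i] for s in seqs).most_common(1)[0][0]
            for i in range(length)
        )
    return consensus


# Toy data: three UMIs, one read carrying an error, one singleton UMI.
reads = [
    ("AAGT", "ACGTAC"),
    ("AAGT", "ACGTAC"),
    ("AAGT", "ACTTAC"),  # error at position 2, outvoted by the other reads
    ("CCTA", "ACGAAC"),
    ("CCTA", "ACGAAC"),
    ("CCTA", "ACGAAC"),  # a variant supported by every read under this UMI
    ("GGGA", "ACGTAC"),  # singleton UMI, dropped
]
result = umi_consensus(reads)
print(result)
```

In this sketch, a difference that appears in every read of a UMI group survives into the consensus, while one that appears in a single read is outvoted. That distinction is what lets UMI-based approaches separate rare true variants from amplification noise.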

Normalization of Genomic DNA Using Duplex-Specific Nuclease

Shagina, Bogdanova, Mamedov, et al. 2010. BioTechniques

An application of duplex-specific nuclease (DSN) normalization technology to whole-genome shotgun sequencing of genomes with a large proportion of repetitive DNA is described. The method uses a thermostable DSN from the Kamchatka crab that specifically hydrolyzes dsDNA. In model experiments on human genomic DNA, we demonstrated that DSN normalization of double-stranded DNA formed during C0t analysis is effective against abundant repetitive sequences with high sequence identity, while retaining highly divergent repeats and coding regions at base-line levels. Thus, DSN normalization applied to C0t analysis can be used to eliminate evolutionarily young repetitive elements from genomic DNA before sequencing, and should prove invaluable in studies of large eukaryotic genomes, such as those of higher plants.

A method for the preparation of normalized cDNA libraries enriched with full-length sequences

et al. 2005

We developed a new method for the preparation of normalized cDNA libraries enriched with full-length sequences. It is based on the properties of the recently characterized duplex-specific nuclease from the hepatopancreas of the Kamchatka crab. The duplex-specific nuclease is thermostable, it effectively cleaves double-stranded DNA and is inactive toward single-stranded DNA (Shagin et al., Genome Res., 2002, vol. 12, pp. 1935-1942). Our method enables the normalization of cDNA samples enriched with full-length sequences without use of laborious and ineffective stages of physical separation. The efficiency of the method was demonstrated in model experiments using cDNA samples from several human tissues.

A high-throughput assay for quantitative measurement of PCR errors

Shagin, Shagina, Zaretsky, et al. 2017. Sci Rep

The accuracy with which DNA polymerase can replicate a template DNA sequence is an extremely important property that can vary by an order of magnitude from one enzyme to another. The rate of nucleotide misincorporation is shaped by multiple factors, including PCR conditions and proofreading capabilities, and proper assessment of polymerase error rate is essential for a wide range of sensitive PCR-based assays. In this paper, we describe a method for studying polymerase errors with exceptional resolution, which combines unique molecular identifier tagging and high-throughput sequencing. Our protocol is less laborious than commonly-used methods, and is also scalable, robust and accurate. In a series of nine PCR assays, we have measured a range of polymerase accuracies that is in line with previous observations. However, we were also able to comprehensively describe individual errors introduced by each polymerase after either 20 PCR cycles or a linear amplification, revealing specific substitution preferences and the diversity of PCR error frequency profiles. We also demonstrate that the detected high-frequency PCR errors are highly recurrent and that the position in the template sequence and polymerase-specific substitution preferences are among the major factors influencing the observed PCR error rate.
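As a rough illustration of how substitution preferences might be tallied from such data, the sketch below compares error-corrected sequences against a known template and counts each substitution type. It is not the paper's protocol: the function name `substitution_profile`, the toy template, and the toy sequences are hypothetical, and the sketch assumes equal-length, already-aligned sequences with no indels.

```python
from collections import Counter


def substitution_profile(reference, sequences):
    """Return (overall error rate, Counter of substitution types).

    Hypothetical helper for illustration. Each sequence is compared
    base by base with the reference; indels are not handled.
    """
    substitutions = Counter()
    bases_compared = 0
    for seq in sequences:
        for ref_base, obs_base in zip(reference, seq):
            bases_compared += 1
            if ref_base != obs_base:
                substitutions[f"{ref_base}>{obs_base}"] += 1
    errors = sum(substitutions.values())
    rate = errors / bases_compared if bases_compared else 0.0
    return rate, substitutions


# Toy template and four toy sequences (made-up data).
reference = "ACGTACGT"
sequences = ["ACGTACGT", "ACGTACAT", "ATGTACGT", "ACGTACAT"]
rate, subs = substitution_profile(reference, sequences)
print(rate, dict(subs))
```

Here the same G>A change recurs at one template position in two sequences, a toy version of the recurrent, position-dependent errors the abstract describes.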

Irina Shagina

Decoding the fine-scale structure of a breast cancer genome and transcriptome
