Tumor Haplotype Assembly Algorithms for Cancer Genomics

Aguiar, Derek; Wong, Wendy S. W.; Istrail, Sorin

doi:10.1142/9789814583220_0002

Cited by 4 publications

(7 citation statements)

References 23 publications

“…The alignments are pre-processed to generate BAM files and remove duplicates by samtools [49] and Picardtools [50], after which SNPs are called using FreeBayes [21] (Figure 1-B). The processed alignments, the reference and the VCF files are used in the haplotyping step by HapCompass [39,43], HapTree [44] and SDhaP [40] to estimate the haplotypes using the phasing information from reads with at least two heterozygous SNPs (Figure 1-C). In the last step, the obtained estimates are compared to the original haplotypes by command-line tool hapcompare that we developed using several measures of estimation quality (Figure 1-D).…”

Section: Methods (mentioning)

confidence: 99%
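
The statement above notes that only reads covering at least two heterozygous SNPs carry phasing information. A minimal sketch of that filter follows, assuming fragments have already been reduced to a mapping from SNP position to observed allele. The `Fragment` type and the `informative_fragments` helper are hypothetical names used for illustration and are not part of HapCompass, HapTree, SDhaP or hapcompare.

```python
from typing import Dict, List, Set

# Hypothetical representation: SNP position -> observed allele
# ('0' = reference, '1' = alternative).
Fragment = Dict[int, str]


def informative_fragments(fragments: List[Fragment], het_sites: Set[int]) -> List[Fragment]:
    """Keep only fragments that cover at least two heterozygous SNP sites.

    Alleles observed at sites outside `het_sites` are dropped, because they
    carry no phasing information.
    """
    kept = []
    for frag in fragments:
        covered = {pos: allele for pos, allele in frag.items() if pos in het_sites}
        if len(covered) >= 2:
            kept.append(covered)
    return kept


if __name__ == "__main__":
    reads = [
        {100: "0", 250: "1"},            # covers two heterozygous sites: kept
        {100: "1"},                      # covers one site only: dropped
        {250: "0", 400: "1", 999: "0"},  # 999 is not heterozygous: trimmed, kept
    ]
    print(informative_fragments(reads, {100, 250, 400}))
```

In a real pipeline the fragments would be derived from the BAM and VCF files; that extraction step is left out here.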

“…Here we review three state-of-the-art haplotyping algorithms applicable to polyploids: HapCompass [39,43], HapTree [44] and SDhaP [40], and evaluate their accuracy through extensive simulations of random genomes and NGS reads. Using the highly heterozygous tetraploid potato (S. tuberosum) as a model, we generated random genomes using a realistic stochastic model with parameters SNP density and distribution of SNP dosages, i.e.…”

Section: Introduction (mentioning)

confidence: 99%
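
To illustrate what generating random genomes from a SNP density and a distribution of SNP dosages could look like, here is a rough sketch. It is not the haplosim model, whose details are not given in the quoted text. The function name `simulate_haplotypes`, the uniform placement of SNPs and the example dosage probabilities are all assumptions made for illustration.

```python
import random
from typing import Dict, List


def simulate_haplotypes(length: int,
                        snp_density: float,
                        dosage_probs: Dict[int, float],
                        ploidy: int = 4,
                        seed: int = 0) -> Dict[int, List[int]]:
    """Return {SNP position: alleles on each homologue} for a random genome.

    Assumptions (illustrative only): SNP positions are uniform over the
    genome, the dosage (number of homologues carrying the alternative allele)
    is drawn from `dosage_probs`, and carriers are chosen uniformly at random.
    """
    rng = random.Random(seed)
    n_snps = max(1, round(length * snp_density))
    positions = sorted(rng.sample(range(length), n_snps))
    dosages = list(dosage_probs)
    weights = [dosage_probs[d] for d in dosages]
    haplotypes = {}
    for pos in positions:
        dosage = rng.choices(dosages, weights=weights, k=1)[0]
        carriers = set(rng.sample(range(ploidy), dosage))
        # 1 = alternative allele, 0 = reference allele, one entry per homologue.
        haplotypes[pos] = [1 if h in carriers else 0 for h in range(ploidy)]
    return haplotypes


if __name__ == "__main__":
    # Hypothetical parameters: 1 kb region, 2% SNP density, tetraploid.
    haps = simulate_haplotypes(1000, 0.02, {1: 0.5, 2: 0.3, 3: 0.2})
    for pos, alleles in list(haps.items())[:5]:
        print(pos, alleles)
```

Restricting dosages to values between 1 and ploidy minus 1 keeps every simulated site heterozygous, which is the case relevant to phasing.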

Exploiting next-generation sequencing to solve the haplotyping puzzle in polyploids: a simulation study

Motazedi, Finkers, Maliepaard et al. 2017, Brief Bioinform

Haplotypes are the units of inheritance in an organism, and many genetic analyses depend on their precise determination. Methods for haplotyping single individuals use the phasing information available in next-generation sequencing reads, by matching overlapping single-nucleotide polymorphisms while penalizing post hoc nucleotide corrections made. Haplotyping diploids is relatively easy, but the complexity of the problem increases drastically for polyploid genomes, which are found in both model organisms and in economically relevant plant and animal species. Although a number of tools are available for haplotyping polyploids, the effects of the genomic makeup and the sequencing strategy followed on the accuracy of these methods have hitherto not been thoroughly evaluated. We developed the simulation pipeline haplosim to evaluate the performance of three haplotype estimation algorithms for polyploids: HapCompass, HapTree and SDhaP, in settings varying in sequencing approach, ploidy levels and genomic diversity, using tetraploid potato as the model. Our results show that sequencing depth is the major determinant of haplotype estimation quality, that 1 kb PacBio circular consensus sequencing reads and Illumina reads with large insert-sizes are competitive and that all methods fail to produce good haplotypes when ploidy levels increase. Comparing the three methods, HapTree produces the most accurate estimates, but also consumes the most resources. There is clearly room for improvement in polyploid haplotyping algorithms.
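
The abstract describes matching overlapping SNPs while penalizing post hoc nucleotide corrections. One common way to score this idea is a minimum error correction (MEC) style count. Reading the abstract this way is an assumption, and the sketch below is not taken from any of the three tools. The name `mec_score` and the dictionary-based data layout are hypothetical.

```python
from typing import Dict, List

Haplotype = Dict[int, int]  # SNP position -> allele (0 or 1)
Fragment = Dict[int, int]   # SNP position -> observed allele (0 or 1)


def mec_score(haplotypes: List[Haplotype], fragments: List[Fragment]) -> int:
    """Sum, over fragments, of the fewest allele corrections needed to make
    the fragment agree with its best-matching haplotype."""
    total = 0
    for frag in fragments:
        total += min(
            sum(1 for pos, allele in frag.items() if hap[pos] != allele)
            for hap in haplotypes
        )
    return total


if __name__ == "__main__":
    haps = [{1: 0, 2: 0, 3: 1}, {1: 1, 2: 1, 3: 0}]
    frags = [{1: 0, 2: 0}, {2: 1, 3: 0}, {1: 0, 2: 1, 3: 1}]
    # The first two fragments match a haplotype exactly; the third needs
    # one correction (at position 2) to match the first haplotype.
    print(mec_score(haps, frags))
```

A lower score means the candidate haplotypes explain the reads with fewer corrections. The same function works for any ploidy, since it only takes the minimum over however many haplotypes are supplied.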

“…For simplicity, we focus on the most prevalent type SNPs, bi-allelic SNPs, for which the alleles can be represented by '0' (the reference) and '1' (the alternative). a) HapCompass: Aguiar and Istrail (2013) extend their graphical haplotype estimation approach for diploids [39], by constructing the polyploid Compass graph, which has k nodes for each variant site, s_i, of a k-ploid corresponding to the k alleles at that site [43]. To each SNP pair, s_i, s_j, that is covered by at least one of the m fragments, the phasing with the largest likelihood is assigned by a polyploid likelihood model, conditional on the covering fragments and assuming a fixed base calling error rate.…”

Section: Methods (mentioning)

confidence: 99%
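
The pairwise step described in the statement above can be sketched as follows: enumerate the possible phasings of one SNP pair, score each against the fragments covering both sites under a fixed base-calling error rate, and keep the most likely one. This is a simplified illustration and not the HapCompass likelihood model. The function names, the assumption that a fragment is equally likely to originate from each homologue, and the default error rate of 0.01 are all choices made for this example.

```python
import math
from itertools import permutations
from typing import List, Sequence, Tuple

Phasing = Tuple[Tuple[int, int], ...]  # one (allele at s_i, allele at s_j) per homologue


def pair_phasings(alleles_i: Sequence[int], alleles_j: Sequence[int]) -> List[Phasing]:
    """Enumerate the distinct ways to pair the k alleles at s_i with those at s_j."""
    seen = set()
    for perm in permutations(alleles_j):
        seen.add(tuple(sorted(zip(alleles_i, perm))))
    return sorted(seen)


def log_likelihood(phasing: Phasing, observations: Sequence[Tuple[int, int]],
                   error_rate: float) -> float:
    """Log-likelihood of the observed allele pairs, assuming each fragment comes
    from a uniformly chosen homologue and bases err independently."""
    k = len(phasing)
    total = 0.0
    for a, b in observations:
        p = 0.0
        for h_i, h_j in phasing:
            p_a = 1 - error_rate if a == h_i else error_rate
            p_b = 1 - error_rate if b == h_j else error_rate
            p += p_a * p_b / k
        total += math.log(p)
    return total


def best_phasing(alleles_i, alleles_j, observations, error_rate=0.01) -> Phasing:
    """Return the phasing of the SNP pair with the largest likelihood."""
    return max(pair_phasings(alleles_i, alleles_j),
               key=lambda ph: log_likelihood(ph, observations, error_rate))


if __name__ == "__main__":
    # Diploid example: three fragments support 0-0 / 1-1, one supports 0-1.
    obs = [(0, 0), (1, 1), (0, 0), (0, 1)]
    print(best_phasing((0, 1), (0, 1), obs))
```

Enumerating permutations grows factorially with ploidy, so this brute-force form is only practical for small k. It is meant to make the per-pair decision concrete, not to be efficient.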

Exploiting Next Generation Sequencing to Solve the Haplotyping Puzzle in Polyploids: A Simulation Study

Motazedi, Finkers, Maliepaard et al. 2016, Preprint

Haplotypes are the units of inheritance in an organism, and many genetic analyses depend on their precise determination. Methods for haplotyping single individuals use the phasing information available in Next Generation Sequencing reads, by matching overlapping SNPs while penalizing post hoc nucleotide corrections made. Haplotyping diploids is relatively easy, but the complexity of the problem increases drastically for polyploid genomes, which are found in both model organisms and in economically relevant plant and animal species. While a number of tools are available for haplotyping polyploids, the effects of the genomic makeup and the sequencing strategy followed on the accuracy of these methods have hitherto not been thoroughly evaluated. We developed the simulation pipeline haplosim to evaluate the performance of haplotype estimation algorithms for polyploids: HapCompass, HapTree and SDhaP, in settings varying in sequencing approach, ploidy levels and genomic diversity, using tetraploid potato as the model. Our results show that sequencing depth is the major determinant of haplotype estimation quality, that 1 kb PacBio CCS reads and Illumina reads with large insert-sizes are competitive, and that all methods fail to produce good haplotypes when ploidy levels increase. Comparing the three methods, HapTree produces the most accurate estimates, but also consumes the most resources. There is clearly room for improvement in polyploid haplotyping algorithms.

Author contributions: RF, CM, EM and DdR designed the study, revised and approved the manuscript. EM developed the simulation pipeline and performed the analyses.

“…In recent years, the CRISPR/Cas9-based negative screening strategy with high-throughput random screening has been applied to identify potential drug resistance mutations. Due to the limits of library construction, the induced mutation is identified within a limited region of the gene sequence, while randomly induced mutations are rarely observed in clinical samples and have limited clinical validity (52–54). The CRISPR/Cas9-based saturation mutation strategy to identify the drug resistance pathway hub gene displays high efficiency and clinical utility.…”

Section: Discussion (mentioning)

confidence: 99%

CRISPR/Cas9‑induced saturated mutagenesis identifies Rad51 haplotype as a marker of PARP inhibitor sensitivity in breast cancer

et al. 2022

Breast cancer treatment with poly(ADP-ribose)polymerase (PARP) inhibitors is currently limited to cells defective in the homologous recombination repair (HRR) pathway. The chemical inhibition of many HRR deficiency genes may sensitize cancer cells to PARP inhibitors. In the present study, Rad51, a central player in the HRR pathway, was selected to explore additional low variation and highly representative markers for PARP inhibitor activity. A CRISPR/Cas9-based saturated mutation approach for the Rad51 WALKER domain was used to evaluate the sensitivity of the PARP inhibitor olaparib. Five amino acid mutation sites were identified in olaparib-resistant cells. Two Rad51 haplotypes were assembled from the mutations, and may represent useful pharmacogenomic markers of PARP inhibitor sensitivity.
