2014
DOI: 10.1186/1471-2105-15-164

WinHAP2: an extremely fast haplotype phasing program for long genotype sequences

Abstract: Background: The haplotype phasing problem tries to screen for phenotype associated genomic variations from millions of candidate data. Most of the current computer programs handle this problem with high requirements of computing power and memory. By replacing the computation-intensive step of constructing the maximum spanning tree with a heuristics of estimated initial haplotype, we released the WinHAP algorithm version 1.0, which outperforms the other algorithms in terms of both running speed and overall accura…

Cited by 6 publications (2 citation statements)
References 30 publications

“…Currently, most state-of-the-art algorithms employ the alignment-based strategy to determine adjacent loci based on Illumina short reads and/or PacBio long reads [8], such as HAPCUT2 [9], WHATSHAP [14] and WinHAP2 [41]. However, phasing with Illumina short reads often produce limited length of phased blocks since the distance of adjacent polymorphic sites can exceed the length of a typical Illumina read or read pair.…”
Section: Discussion (mentioning)
confidence: 99%
“…Sequencing technologies continue to decrease in cost, with the result that it is now feasible to sequence up to tens of thousands of taxa in multiple genes, at least in viruses. As the computational challenges associated with haplotype phasing are resolved ( Pan et al 2014 ; Zhi and Zhang 2014 ; Regan et al 2015 ), phylogenetic methods will be used for large data sets on higher organisms. As an example, the Epidemiology Network Ag1000G has 765 Anopheles mosquito genomes visible to the public ( MalariaGEN 2015 ).…”
Section: Discussion (mentioning)
confidence: 99%