2013
DOI: 10.1186/1471-2105-14-s15-s13

Haploid to diploid alignment for variation calling assessment

Abstract: Motivation: Variation calling is the process of detecting differences between donor and consensus DNA via high-throughput sequencing read mapping. When evaluating the performance of different variation calling methods, a typical scenario is to simulate artificial (diploid) genomes and sample reads from those. After variation calling, one can then compute precision and recall statistics. This works reliably on SNPs, but on larger indels there is the problem of invariance: a predicted deletion/insertion can differ …
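The precision and recall computation mentioned in the abstract can be sketched as follows. This is only a minimal illustration, assuming that variants are represented as hypothetical `(position, reference allele, alternative allele)` tuples and compared by exact match; the tuple layout and the function name `precision_recall` are assumptions made for this example, not something the paper specifies.

```python
def precision_recall(predicted, truth):
    """Exact-match precision and recall over two collections of variants.

    Variants are hashable tuples such as (position, ref, alt); this
    representation is an assumption made for the example.
    """
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    # Guard against empty sets so that the ratios are always defined.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Three simulated SNPs; the caller recovers two and reports two false ones.
truth = {(100, "A", "G"), (250, "C", "T"), (400, "G", "A")}
predicted = {(100, "A", "G"), (250, "C", "T"), (500, "T", "C"), (600, "A", "C")}

precision, recall = precision_recall(predicted, truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Exact tuple matching is adequate for SNPs because a SNP has a single position and a single substituted base.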

Cited by 4 publications
(5 citation statements)
References 5 publications
“…The alignment score gives a ranking for deletion prediction in the case that one of the sets corresponds to the ground truth. This approach was recently extended to the diploid ground-truth setting (Mäkinen and Rahkola, 2013) with an O(dn) algorithm, where d is the unit cost edit distance between predicted sequence and ground-truth, and n is the maximum sequence length. In practice, this approach is not scalable to whole-genome comparisons (Mäkinen and Rahkola, 2013); computation took 6 to 11.5 hours, depending on prediction accuracy, only for the 63 Mbp long human chromosome 20.…”
Section: Related Work (mentioning)
confidence: 99%
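To make the O(dn) bound quoted above concrete, here is a minimal sketch of the classical banded edit distance with threshold doubling on two plain (haploid) strings. It is a standard textbook technique shown only to illustrate where a bound of this shape comes from; it is not the diploid algorithm of Mäkinen and Rahkola, and the function names are assumptions for the example.

```python
def banded_edit_distance(a, b, k):
    """Unit-cost edit distance between a and b if it is <= k, else None.

    Only cells with |i - j| <= k are filled, so the work is O(k * len(a)).
    Row i stores column j at band index t = j - i + k.
    """
    n, m = len(a), len(b)
    if abs(n - m) > k:
        return None
    inf = k + 1                      # any value above k means "too far"
    width = 2 * k + 1
    prev = [inf] * width
    for t in range(width):           # row 0: D[0][j] = j
        j = t - k
        if 0 <= j <= m:
            prev[t] = min(j, inf)
    for i in range(1, n + 1):
        cur = [inf] * width
        for t in range(width):
            j = i + t - k
            if j < 0 or j > m:
                continue
            if j == 0:
                best = i             # delete the first i characters of a
            else:
                best = prev[t] + (a[i - 1] != b[j - 1])   # diagonal step
                if t > 0:
                    best = min(best, cur[t - 1] + 1)      # from (i, j-1)
            if t + 1 < width:
                best = min(best, prev[t + 1] + 1)         # from (i-1, j)
            cur[t] = min(best, inf)
        prev = cur
    d = prev[m - n + k]
    return d if d <= k else None


def edit_distance(a, b):
    """Double the band until the distance fits; total work is O(d * n)."""
    k = max(1, abs(len(a) - len(b)))
    while True:
        d = banded_edit_distance(a, b, k)
        if d is not None:
            return d
        k *= 2


print(edit_distance("kitten", "sitting"))
```

Because the band width tracks the distance d, accurate predictions (small d) finish quickly while inaccurate ones cost more, which is consistent with the quoted running times depending on prediction accuracy.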
“…Another application is possible in variant calling evaluation [7], as covering alignment takes heterozygous variations properly into account: High-throughput sequencing allows a cost-effective way to discover how an individual genome differs from the consensus reference genome of the species. The result of such variant calling process is a list of homozygous and heterozygous variant predictions.…”
Section: Introduction (mentioning)
confidence: 99%
“…Because of that, in [7] all predicted variants are applied to the reference in order to create a predicted haploid genome. Edit distance between the predicted haploid and the ground-truth diploid was then computed, allowing arbitrary recombinations for the haploid to distribute along the diploid, giving a distance measure.…”
Section: Introduction (mentioning)
confidence: 99%
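The haploid-to-diploid distance described in this statement can be illustrated with a simple quadratic dynamic program. This is a sketch under stated assumptions, not the authors' O(dn) method: the diploid is assumed to be given as two equal-length gapped rows (a pairwise alignment of the two haplotypes), recombination is assumed to be free at every column, and the name `haploid_to_diploid_distance` and the `"-"` gap symbol are hypothetical choices for the example.

```python
def haploid_to_diploid_distance(haploid, row_a, row_b, gap="-"):
    """Unit-cost edit distance from a haploid string to the closest path
    through a diploid given as two aligned, gapped haplotype rows.

    The path may switch rows freely between columns (recombination).
    """
    if len(row_a) != len(row_b):
        raise ValueError("haplotype rows must have equal aligned length")
    n = len(haploid)
    # prev[i]: cost of aligning haploid[:i] to the columns seen so far.
    prev = list(range(n + 1))        # no columns yet: insert i characters
    for sym_a, sym_b in zip(row_a, row_b):
        options = {sym_a, sym_b}     # symbols offered by this column
        cur = [0] * (n + 1)
        for i in range(n + 1):
            best = float("inf")
            for c in options:
                if c == gap:
                    best = min(best, prev[i])        # column contributes nothing
                else:
                    best = min(best, prev[i] + 1)    # diploid symbol unmatched
                    if i > 0:                        # match or mismatch
                        best = min(best, prev[i - 1] + (haploid[i - 1] != c))
            if i > 0:
                best = min(best, cur[i - 1] + 1)     # extra haploid character
            cur[i] = best
        prev = cur
    return prev[n]


# Heterozygous deletion of G: both haplotypes are reachable at cost 0.
print(haploid_to_diploid_distance("ACTA", "ACGTA", "AC-TA"))
# Two heterozygous SNPs: the prediction mixes alleles from both rows.
print(haploid_to_diploid_distance("ATGGA", "AAGGA", "ATGCA"))
```

Under these assumptions a prediction is not penalized for choosing either allele at a heterozygous site, nor for mixing alleles from the two haplotypes, which is the effect of allowing arbitrary recombinations.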