2020
DOI: 10.1101/2020.10.24.353755
Preprint

A Fast Data-Driven Method for Genotype Imputation, Phasing, and Local Ancestry Inference: MendelImpute.jl

Abstract: Current methods for genotype imputation and phasing exploit the sheer volume of data in haplotype reference panels and rely on hidden Markov models. Existing programs all have essentially the same imputation accuracy, are computationally intensive, and generally require pre-phasing the typed markers. We propose a novel data-mining method for genotype imputation and phasing that substitutes highly efficient linear algebra routines for hidden Markov model calculations. This strategy, embodied in our Julia progra…

Cited by 1 publication (1 citation statement)
References 33 publications
“…Taken together, these curated variant datasets enable an alternative class of models to be used to predict ancestry based upon samples labeled with known ancestry [18][19][20][21][22][23][24][25][26][27][28] . However, many methods suffer shortcomings, including not having discrete ancestry labels beyond the main continental groups or, for those methods using the 1kGP, not considering that many subjects are within the same families and, therefore, fail to satisfy the principle of independent and identically distributed data.…”
Section: Introduction
Citation type: mentioning
confidence: 99%