2023
DOI: 10.1093/nar/gkad149

Rye: genetic ancestry inference at biobank scale

Abstract: Biobank projects are generating genomic data for many thousands of individuals. Computational methods are needed to handle these massive data sets, including genetic ancestry (GA) inference tools. Current methods for GA inference do not scale to biobank-size genomic datasets. We present Rye, a new algorithm for GA inference at biobank scale. We compared the accuracy and runtime performance of Rye to the widely used RFMix, ADMIXTURE and iAdmix programs and applied it to a dataset of 488221 genome-wide variant sa…

Cited by 6 publications (9 citation statements)
References 37 publications
“…Considering these results, it is, in principle, possible to use PC coordinates to infer admixture proportions of a target population using a set of sources. Different attempts and approaches have recently been proposed using principal components (Conley et al, 2023).…”
Section: F-statistics Results Broadly Recapitulate Genetic Relationsh... (mentioning)
confidence: 99%
“…We compared ASAP with qpAdm, Rye, and Unlinked-ChromoPainter NNLS, which harness f4-statistics, PCA, and a modified Li and Stephens model with infinite recombination between SNPs for the ancestry composition inference, respectively (Conley et al, 2023; Haak et al, 2015; Harney et al, 2021; Li and Stephens, 2003). We compared the accuracy in estimating the ancestral proportions of the four approaches using the pseudo-haploid genomes of both the target admixed samples and the true sources of the admixture.…”
Section: Results (mentioning)
confidence: 99%
“…PC data from All of Us samples and the HGDP and 1000 Genomes samples were used to compute individual participant genetic ancestry fractions for All of Us samples using the Rye program. Rye uses PC data to carry out rapid and accurate genetic ancestry inference on biobank-scale datasets [47]. HGDP and 1000 Genomes reference samples were used to define a set of six distinct and coherent ancestry groups (African, East Asian, European, Middle Eastern, Latino/admixed American and South Asian) corresponding to participant self-identified race and ethnicity groups.…”
Section: Methods (mentioning)
confidence: 99%
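
The citation statements above describe inferring admixture proportions from PC coordinates using a set of source populations. The following is a minimal sketch of that general idea, not Rye's actual implementation: it assumes each source is summarized by a single centroid in PC space and fits non-negative weights that sum to 1 by projected gradient descent on a least-squares objective. The function names (`project_to_simplex`, `admixture_from_pcs`) and the toy centroids are hypothetical and chosen only for illustration.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) == 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for i, ui in enumerate(u, start=1):
        cumulative += ui
        t = (cumulative - 1.0) / i
        if ui - t > 0:
            theta = t  # keep the threshold from the last index that qualifies
    return [max(x - theta, 0.0) for x in v]


def admixture_from_pcs(target, sources, steps=2000):
    """Estimate ancestry fractions for one sample from PC coordinates.

    target  : list of PC coordinates for the sample (length d)
    sources : list of k source centroids, each a list of d PC coordinates
              (assumption: one centroid per reference group)
    Returns k weights, non-negative and summing to 1, that minimize the
    squared distance between the target and the weighted centroid mix.
    """
    k, d = len(sources), len(target)
    w = [1.0 / k] * k  # start from equal proportions
    # 2 * squared Frobenius norm bounds the gradient's Lipschitz constant,
    # so 1 / lipschitz is a safe fixed step size.
    lipschitz = 2.0 * sum(x * x for s in sources for x in s)
    step = 1.0 / lipschitz
    for _ in range(steps):
        mix = [sum(w[j] * sources[j][c] for j in range(k)) for c in range(d)]
        resid = [mix[c] - target[c] for c in range(d)]
        grad = [2.0 * sum(resid[c] * sources[j][c] for c in range(d))
                for j in range(k)]
        w = project_to_simplex([w[j] - step * grad[j] for j in range(k)])
    return w


if __name__ == "__main__":
    # Toy centroids for three hypothetical reference groups in 2-D PC space.
    centroids = [[-1.0, 0.0], [1.0, 0.0], [0.0, 1.5]]
    # A sample constructed as 50% / 30% / 20% of the three centroids.
    sample = [-0.2, 0.3]
    print([round(x, 4) for x in admixture_from_pcs(sample, centroids)])
```

With more PCs than two, or with sources that are not affinely independent in PC space, the fitted fractions may not be unique, so a sketch like this is best read as an illustration of the constraint structure rather than a usable estimator.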