2020
DOI: 10.1007/978-3-030-45257-5_19

CluStrat: A Structure Informed Clustering Strategy for Population Stratification

Cited by 4 publications (4 citation statements)
References: 9 publications
“…The simulated datasets included a wide array of configurations and were generated using the data simulator in a previous work. 22 Additionally, UK Biobank enhanced PRS (ePRS-UKB) for multiple phenotypes were used to further evaluate the model in real-world scenarios across different disease outcomes.…”
Section: Methods (mentioning)
confidence: 99%
“…It is worth noting that there is a long line of research on matrix sketching methods, including gaussian sketching, the use of the subsampled randomized hadamard transforms, the count-min sketch, etc. and its application in human genetics [1][2][3]. In our work, we evaluated both the count-min sketch and the gaussian sketch.…”
Section: Mask-lmm (mentioning)
confidence: 99%
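
For readers unfamiliar with the Gaussian sketch named in the statement above, the following is a minimal illustration only, not the citing paper's implementation. The function name `gaussian_sketch`, the matrix sizes, and the seeds are all hypothetical. It compresses the rows of a matrix A with a random matrix S whose entries are i.i.d. N(0, 1/s), so that the squared norm of SA approximates that of A in expectation.

```python
import numpy as np

def gaussian_sketch(A, s, seed=0):
    """Return S @ A, where S is an s x n matrix with i.i.d. N(0, 1/s) entries.

    Hypothetical helper for illustration; not taken from the cited work.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    # Scaling by 1/sqrt(s) makes E[||S x||^2] = ||x||^2 for any fixed x.
    S = rng.standard_normal((s, n)) / np.sqrt(s)
    return S @ A

# Toy input: 1000 rows compressed to 200 sketched rows (sizes are arbitrary).
A = np.random.default_rng(1).standard_normal((1000, 5))
SA = gaussian_sketch(A, s=200)

# Ratio of squared Frobenius norms; close to 1 when s is large enough.
ratio = np.linalg.norm(SA) ** 2 / np.linalg.norm(A) ** 2
print(SA.shape, round(ratio, 3))
```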
“…We generated simulated data emulating real-world populations to evaluate whether ThreSPCA can correctly identify markers which contribute to the genetic differences between and within the populations. Based on previous work [32], we simulated two datasets varying m = {5000, 10000} SNPs genotyped across n = {500, 1000} individuals based on the Pritchard-Stephens-Donelly (PSD) model [33] with the mixing parameter between populations, α = 0.01. The allele frequencies were simulated based on real-world data from three divergent populations, namely CEU (Utah residents with Northern and Western European ancestry), ASW (African ancestry in Southwestern US), and MXL (Mexican ancestry in California) from the HapMap Phase 3 data [34].…”
Section: Data (mentioning)
confidence: 99%
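
The simulation described in the statement above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the simulator used in either paper: the function name `simulate_psd` is hypothetical, α is assumed to be the Dirichlet concentration for admixture proportions, and the ancestral allele frequencies are drawn uniformly here as a placeholder, whereas the quoted work derives them from HapMap Phase 3 data.

```python
import numpy as np

def simulate_psd(n, m, k=3, alpha=0.01, seed=0):
    """Simulate an n x m genotype matrix under a PSD-style admixture model.

    Hypothetical sketch: allele frequencies are uniform placeholders,
    not the HapMap-based frequencies described in the quoted text.
    """
    rng = np.random.default_rng(seed)
    # Placeholder ancestral allele frequencies, one row per population.
    P = rng.uniform(0.05, 0.95, size=(k, m))
    # Admixture proportions per individual; small alpha gives rows that
    # are close to a single population (assumed role of alpha).
    Q = rng.dirichlet(np.full(k, alpha), size=n)
    # Individual-specific allele frequencies, clipped against rounding error.
    F = np.clip(Q @ P, 0.0, 1.0)
    # Genotypes are counts of the alternate allele over two chromosomes.
    G = rng.binomial(2, F)
    return G, Q, P

# One of the configurations mentioned above: n = 500 individuals, m = 5000 SNPs.
G, Q, P = simulate_psd(n=500, m=5000)
print(G.shape)
```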