2017
DOI: 10.1093/bioinformatics/btx369

A multi-scenario genome-wide medical population genetics simulation framework

Abstract: Supplementary data are available at Bioinformatics online.

Cited by 5 publications (4 citation statements)
References 17 publications
“…To facilitate the assessment of common GWAS tools, we simulated homogeneous and heterogeneous datasets based on haplotypes from the 1000 Genomes project spanning the genome and realistic enough to mimic African, European, and admixed populations to challenge the statistical methods for association testing in real-world conditions. We used a resampling model with recombination breakpoints while mimicking mutation rates as implemented in FractalSIM (Mugo et al, 2017).…”
Section: Methods (mentioning)
confidence: 99%
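
The resampling idea in the statement above (new haplotypes assembled from reference haplotypes between recombination breakpoints, with occasional mutations) can be illustrated with a minimal sketch. This is not FractalSIM's implementation: the function name `resample_haplotype`, its parameters, the uniform breakpoint placement, and the 0/1 allele coding are all assumptions made for illustration.

```python
import numpy as np

def resample_haplotype(reference, n_breakpoints, mutation_rate, rng):
    """Build one mosaic haplotype from a reference panel.

    reference: 2-D array of 0/1 alleles, rows = haplotypes, columns = SNPs.
    Hypothetical helper; breakpoints are placed uniformly at random here,
    whereas a real simulator would typically use a recombination map.
    """
    n_haps, n_snps = reference.shape
    # Pick distinct breakpoint positions that split the SNPs into segments.
    cuts = np.sort(rng.choice(np.arange(1, n_snps), size=n_breakpoints, replace=False))
    bounds = np.concatenate(([0], cuts, [n_snps]))
    new_hap = np.empty(n_snps, dtype=reference.dtype)
    # Copy each segment from a randomly chosen donor haplotype.
    for start, end in zip(bounds[:-1], bounds[1:]):
        donor = rng.integers(n_haps)
        new_hap[start:end] = reference[donor, start:end]
    # Mimic mutation by flipping each allele with a small probability.
    flips = rng.random(n_snps) < mutation_rate
    new_hap[flips] = 1 - new_hap[flips]
    return new_hap

# Usage with a toy panel of 4 haplotypes and 50 SNPs (made-up data).
rng = np.random.default_rng(0)
panel = rng.integers(0, 2, size=(4, 50)).astype(np.int8)
simulated = resample_haplotype(panel, n_breakpoints=3, mutation_rate=1e-3, rng=rng)
```

Copying whole segments from donors preserves local linkage disequilibrium within each segment, which is the usual motivation for resampling real haplotypes rather than drawing alleles independently.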
“…The genetic model with 30 disease-associated SNPs out of total 10,000 preserves the characteristic sparsity of “true signal” in GWAS datasets, while keeping the dataset size manageable for rigorous simulations. Number of samples, number of SNPs, number of disease-associated SNPs, and effect sizes were set in accordance with precedence in literature [106-108], where in particular, effect sizes were selected to ensure genotype-specific disease odds ratio remained realistic (in the range: 1-3) [83-85] (S3 Fig). For each simulated true disease phenotype Y', differential misclassification was introduced at varying degrees by switching a fraction of randomly selected controls to cases.…”
Section: Methods (mentioning)
confidence: 99%
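
The misclassification step quoted above (switching a fraction of randomly selected controls to cases) can be sketched as follows. The function name `misclassify_controls`, the 0 = control / 1 = case coding, and the rounding rule are assumptions for illustration; the citing paper's own code is not shown on this page.

```python
import numpy as np

def misclassify_controls(y_true, fraction, rng):
    """Return a copy of y_true with a fraction of controls relabelled as cases.

    y_true: 1-D array coded 0 = control, 1 = case (assumed coding).
    fraction: share of controls to switch, between 0 and 1.
    """
    y_obs = np.asarray(y_true).copy()
    controls = np.flatnonzero(y_obs == 0)
    # Number of controls to switch, rounded to the nearest integer.
    n_switch = int(round(fraction * controls.size))
    switched = rng.choice(controls, size=n_switch, replace=False)
    y_obs[switched] = 1
    return y_obs

# Usage: 80 controls and 20 cases, with 25% of controls switched.
rng = np.random.default_rng(0)
y_true = np.array([0] * 80 + [1] * 20)
y_obs = misclassify_controls(y_true, fraction=0.25, rng=rng)
```

Only controls are touched, so every true case stays a case; the error is one-directional, as in the quoted description.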
“…The genetic model with 300 disease-associated SNPs with realistic effect sizes out of total 100,000 SNPs preserves the characteristic sparsity of "true signal" in GWAS datasets, with phenotypic variance explained by each SNP in the empirically observed range of 1e-9% to 3%. Overall, the number of SNPs, number of disease-associated SNPs, phenotype heritability values, and simulated effect sizes were set in accordance with precedence in literature [86,87,138-140].…”
Section: Simulation Study Simulation Datasets (mentioning)
confidence: 99%
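
For readers unfamiliar with the "phenotypic variance explained by each SNP" quantity mentioned above, a common textbook approximation is 2p(1 − p)β² divided by the phenotype variance, where p is the allele frequency and β the per-allele effect. The sketch below assumes an additive model and Hardy-Weinberg equilibrium; neither assumption is stated in the quoted text, and the function name `variance_explained` is hypothetical.

```python
def variance_explained(beta, allele_freq, phenotype_variance=1.0):
    """Fraction of phenotypic variance explained by one SNP.

    Assumes an additive effect `beta` per allele copy and Hardy-Weinberg
    equilibrium, so the genotype variance is 2 * p * (1 - p).
    """
    genotype_variance = 2.0 * allele_freq * (1.0 - allele_freq)
    return genotype_variance * beta ** 2 / phenotype_variance

# Usage: a per-allele effect of 0.1 (in phenotype standard deviations)
# at allele frequency 0.5.
fraction = variance_explained(beta=0.1, allele_freq=0.5)
```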