2018
DOI: 10.1002/mpr.1608

A tutorial on conducting genome‐wide association studies: Quality control and statistical analysis

Abstract: Objectives: Genome‐wide association studies (GWAS) have become increasingly popular to identify associations between single nucleotide polymorphisms (SNPs) and phenotypic traits. The GWAS method is commonly applied within the social sciences. However, statistical analyses will need to be carefully conducted and the use of dedicated genetics software will be required. This tutorial aims to provide a guideline for conducting genetic analyses. Methods: We discuss and explain key concepts and illustrate how to conduct …

Citations: cited by 569 publications (500 citation statements)
References: 51 publications (58 reference statements)
“…However, it is overly strict for densely genotyped and imputed studies where correlations between variants exist (Power et al, 2016b) and requires much larger sample sizes in order to detect causal variants. To overcome the issue of strictness, some tools (Jaillard et al, 2018) implement the Benjamini Hochberg false discovery rate (FDR) (Benjamini and Hochberg, 1995), a less stringent method to control for multiple testing Type I errors but it has also been found to be conservative (Storey and Tibshirani, 2003) as it assumes that SNPs are independent, which is seldom true (Marees et al, 2018). Understanding the level of LD between SNPs and computing an appropriate significance threshold that is optimal for each study (Visscher et al, 2012) therefore presents a feasible and ideal solution.…”
Section: Analytical Considerations and Pitfalls (mentioning)
confidence: 99%
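
The contrast drawn in the statement above, between a strict family-wise correction and the less stringent Benjamini-Hochberg procedure, can be illustrated with a small sketch. It is not taken from any of the cited tools: the function names and the example p-values are hypothetical, and Bonferroni is assumed here as the strict baseline. Both procedures, as written, ignore linkage disequilibrium between SNPs.

```python
def bonferroni_reject(pvalues, alpha=0.05):
    """Reject H0 for p-values at or below alpha / m (family-wise error control)."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg_reject(pvalues, alpha=0.05):
    """Benjamini-Hochberg step-up procedure (false discovery rate control).

    Finds the largest rank k such that the k-th smallest p-value is at or
    below (k / m) * alpha, then rejects the k smallest p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])

    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            max_rank = rank

    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            reject[idx] = True
    return reject


# Hypothetical p-values for ten SNPs (illustration only).
example_pvalues = [0.001, 0.008, 0.039, 0.041, 0.042,
                   0.060, 0.074, 0.205, 0.212, 0.216]

n_bonferroni = sum(bonferroni_reject(example_pvalues))
n_bh = sum(benjamini_hochberg_reject(example_pvalues))
print(n_bonferroni, n_bh)
```

On these made-up values the Benjamini-Hochberg procedure rejects at least as many hypotheses as Bonferroni, which is the sense in which it is less stringent.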
“…We did not impute any genotypes to prevent false positive associations and a larger multiple testing burden. There were 551,839 typed SNPs; subsequent SNP and individual filtering and trimming was based on 1) SNPs with > 20% missing data (239 removed), 2) individuals with > 20% missing data (0 removed), 3) minor allele frequency < 0.01 to remove rare variant associations (260,269 removed), 4) SNPs out of Hardy Weinberg equilibrium for quantitative traits (58 removed due to P < 1e-6) (40). All samples passed kinship and heterozygosity thresholds after the filtering outlined above, leaving 62 samples and 291,273 SNPs to analyse.…”
Section: Methods (mentioning)
confidence: 99%
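
The per-SNP filters listed in the statement above (missingness > 20%, minor allele frequency < 0.01, Hardy-Weinberg P < 1e-6) can be sketched as follows. This is a minimal illustration, not the citing authors' pipeline: the function names and the 0/1/2 genotype coding with `None` for missing calls are assumptions, and a 1-df chi-square goodness-of-fit test is swapped in for the exact HWE test that dedicated genetics software typically uses.

```python
import math


def snp_missing_rate(genotypes):
    """Fraction of samples with no genotype call (None) at this SNP."""
    return sum(g is None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes):
    """MAF from genotypes coded as 0, 1 or 2 copies of one allele."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1 - freq)


def hwe_chi2_pvalue(genotypes):
    """Chi-square (1 df) test of Hardy-Weinberg proportions.

    Assumption: a simplification of the exact test used by GWAS software.
    """
    called = [g for g in genotypes if g is not None]
    n = len(called)
    if n == 0:
        return 1.0
    observed = (called.count(0), called.count(1), called.count(2))
    p = (2 * observed[0] + observed[1]) / (2 * n)
    q = 1 - p
    if p == 0 or q == 0:  # monomorphic SNP: nothing to test
        return 1.0
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def qc_status(genotypes, max_missing=0.2, min_maf=0.01, hwe_threshold=1e-6):
    """Return 'pass' or the name of the first filter the SNP fails."""
    if snp_missing_rate(genotypes) > max_missing:
        return "missingness"
    if minor_allele_frequency(genotypes) < min_maf:
        return "maf"
    if hwe_chi2_pvalue(genotypes) < hwe_threshold:
        return "hwe"
    return "pass"
```

The filters are applied in the order the statement lists them, so a SNP is counted under the first criterion it fails. With only 62 samples, as in the quoted study, an exact HWE test would be preferable to the chi-square approximation used in this sketch.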
“…Quality Control (QC) was done in two steps: 1) the missingness threshold for SNPs was set to 0.2 and 28,610 SNPs were excluded; 2) two thresholds for the Hardy-Weinberg equilibrium were used: 10^−10 was the P-value threshold for excluding SNPs in HP cases, and 10^−6 was the P-value threshold for excluding SNPs in controls [14]. Ten SNPs in HP cases and 10 SNPs in controls failed the test.…”
Section: Methods (mentioning)
confidence: 99%
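
The two-threshold Hardy-Weinberg rule described in the statement above (10^−10 in cases, 10^−6 in controls) could be applied to precomputed HWE p-values roughly as below. The function name, the dictionary layout, and the SNP identifiers are hypothetical; the p-values are assumed to come from whatever HWE test the analysis software provides.

```python
def hwe_exclusions(p_cases, p_controls,
                   case_threshold=1e-10, control_threshold=1e-6):
    """Map each failing SNP to the group(s) in which it fails the HWE test.

    p_cases and p_controls map SNP identifiers to HWE p-values computed
    separately in cases and in controls. A SNP missing from one group is
    treated as passing in that group (an assumption of this sketch).
    """
    excluded = {}
    for snp in sorted(set(p_cases) | set(p_controls)):
        reasons = []
        if p_cases.get(snp, 1.0) < case_threshold:
            reasons.append("cases")
        if p_controls.get(snp, 1.0) < control_threshold:
            reasons.append("controls")
        if reasons:
            excluded[snp] = reasons
    return excluded


# Hypothetical p-values (illustration only).
cases = {"snp1": 1e-12, "snp2": 1e-8, "snp3": 0.4}
controls = {"snp1": 0.2, "snp2": 1e-7, "snp3": 0.5}
print(hwe_exclusions(cases, controls))
```

The more lenient threshold in cases reflects the common reasoning that a true association can itself shift genotype proportions away from Hardy-Weinberg expectations among affected individuals.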