Background
Considering relatives’ health history in logistic regression for case–control genome-wide association studies (CC-GWAS) may provide new information that increases accuracy and power to detect disease-associated genetic variants. We conducted simulations and analyzed type 2 diabetes (T2D) data from the Framingham Heart Study (FHS) to compare two methods that incorporate family history into CC-GWAS: the liability threshold model conditional on both case–control status and family history (LT-FH), and Fam-meta.
Results
In our simulation scenario of a trait with modest T2D heritability (h² = 0.28), variant minor allele frequency ranging from 1% to 50%, and 1% of phenotype variance explained by the genetic variants, Fam-meta had the highest overall power, and both family history methods were more powerful than CC-GWAS. All three methods controlled the type I error rate, with LT-FH being the most conservative (a lower-than-expected error rate). In addition, we observed a substantial increase in power of the two family history methods compared to CC-GWAS when the prevalence of the phenotype increased with age. Furthermore, we showed that, when only the phenotypes of more distant relatives were available, Fam-meta remained more powerful than CC-GWAS, confirming that leveraging disease history of both close and distant relatives can increase the power of association analyses. Using FHS data, we confirmed the well-known association of the TCF7L2 region with T2D at the genome-wide threshold of P-value < 5 × 10⁻⁸, and both family history methods increased the significance of the region compared to CC-GWAS. Using CC-GWAS and Fam-meta, we identified two loci not previously reported for T2D, at 5q35 (ADAMTS2) and 5q23 (PRR16); both genes play a role in cardiovascular diseases. Additionally, CC-GWAS detected one more significant locus at 13q31 (GPC6), which has been reported to be associated with T2D-related traits.
Conclusions
Overall, LT-FH and Fam-meta had higher power than CC-GWAS in simulations, especially for phenotypes that were more prevalent in older age groups, and both methods detected known genetic variants with lower P-values in the real-data application, highlighting the benefits of including family history in genetic association studies.