Zan et al. (2017), Submitted Manuscript. bioRxiv preprint first posted online Sep. 25, 2017; doi: http://dx.doi.org/10.1101/193417. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
Genome-wide association analysis is a powerful tool for identifying genomic loci underlying complex traits. However, its application in natural populations comes with challenges, especially power loss due to population stratification. Here, we apply a bivariate analysis approach to a GWAS dataset of Arabidopsis thaliana. A common allele, strongly confounded with population structure, is discovered to be associated with late flowering and slow maturation of the plant. The discovered genetic effect on flowering time is further replicated in independent datasets. Using Mendelian randomization analysis based on summary statistics from our GWAS and expression QTL scans, we predicted and replicated a candidate gene, AT1G11560, that potentially causes this association. Further analysis indicates that this locus is co-selected with flowering-time-related genes. We demonstrate the efficiency of multi-phenotype analysis in uncovering hidden genetic loci masked by population structure. The discovered pleiotropic genotype-phenotype map provides new insights into the genetic correlation of complex traits.
Author Summary
Joint analysis of multiple phenotypes is of increasing interest in this post-GWAS era because of its potential power to yield more discoveries and to provide insight into pleiotropic genetic architecture. Here, using publicly available A. thaliana data, we provide "textbook" empirical evidence showing how a novel allele, highly confounded with population structure but carrying a large genetic effect, can be detected via a double-trait analysis. The allele simultaneously postpones the flowering time and the maturation endpoint of the plant. The discovered genetic effect can be replicated. We illustrate the bivariate genotype-phenotype map that produces such statistical power. Combining these results with genomic scans of gene expression, we also predict candidate genes using summary-level Mendelian randomization analysis. The results indicate that multi-phenotype analysis is a powerful and reliable strategy for uncovering additional value in established GWAS data.
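To make the intuition behind the double-trait gain in power concrete, the following minimal sketch combines two single-trait z-scores for one variant into a joint two-degree-of-freedom test. It is an illustration under stated assumptions, not necessarily the procedure used in this study: it assumes both traits are measured on the same individuals, so that under the null hypothesis the two z-scores are bivariate normal with correlation equal to the phenotypic correlation r. The function names and all numeric inputs are hypothetical.

```python
import math


def univariate_p(z):
    """Two-sided p-value for a single-trait z-score (chi-square, 1 df)."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def bivariate_stat(z1, z2, r):
    """Quadratic form z' R^{-1} z with R = [[1, r], [r, 1]].

    Assumption: under the null, (z1, z2) is bivariate normal with
    correlation r, the phenotypic correlation between the two traits.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return (z1 * z1 - 2.0 * r * z1 * z2 + z2 * z2) / (1.0 - r * r)


def bivariate_p(z1, z2, r):
    """P-value of the joint test.

    The chi-square survival function with 2 df is exactly exp(-x / 2).
    """
    return math.exp(-bivariate_stat(z1, z2, r) / 2.0)


# Hypothetical inputs: a modest signal on trait 1, a weak one on trait 2,
# and strongly positively correlated traits.
z1, z2, r = 3.0, 1.0, 0.8
print("trait 1 alone:", univariate_p(z1))
print("trait 2 alone:", univariate_p(z2))
print("joint test   :", bivariate_p(z1, z2, r))
```

In this toy setting the joint p-value is smaller than either single-trait p-value, because the pair of effects departs from the direction expected from the phenotypic correlation alone. When the two z-scores line up with that correlation, the joint test gains little.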