Recent developments in genome-wide association studies (GWAS) have led to the localization of disease genes for many complex diseases. Scrutiny of the respective publications reveals, first, that statistical analysis is typically restricted to single-marker analysis in the first step and, second, that the presence of multiple, independently associated SNPs within the same linkage disequilibrium (LD) region is a common phenomenon. Motivated by this observation, we show through a power simulation study that a simultaneous analysis of tightly linked SNPs in the initial GWAS analysis step would lead to increased power compared with single-marker analysis. This is true for all three approaches we considered (implementations in BEAGLE, FAMHAP and UNPHASED). The best performance was obtained using a two-marker haplotype analysis. In conclusion, we would expect additional gene findings from re-analyzing successful GWAS with a multi-marker approach.

European Journal of Human Genetics (2009) 17, 1043-1049; doi:10.1038/ejhg.2009; published online 18 February 2009

Keywords: genome-wide association studies; haplotypes; literature study

Introduction

Biology strongly supports the potential importance of haplotype analysis in genetic association studies. 1 Major reasons for haplotype analysis are the possible representation of non-genotyped SNPs and the presence of multiple, independent disease markers in the same linkage disequilibrium (LD) region. 2 Nevertheless, evidence that haplotypes actually lead to improved power in genetic association studies has yet to be supplied. Indeed, the increased number of degrees of freedom (d.f.) might outweigh the benefit gained from the improved modeling of biology that haplotypes make possible. Moreover, it is not self-evident that haplotype assignment is actually a prerequisite for association analysis in LD regions.
Clayton et al 3 suggest, as an alternative, the analysis of unphased multi-marker genotype data in these regions.

In view of the controversy over the merits of haplotype analysis, we conducted a large-scale power simulation study to investigate the relative performance of three approaches in genome-wide association studies (GWAS): single-marker analysis, simultaneous analysis of unphased genotypes in LD regions (referred to as UMMA, unphased multi-marker analysis, from now on) and haplotype analysis. The setup of the simulation study was guided by a priori knowledge that has recently become available. Data from the International HapMap Project 4 allowed us to specify empirical LD distributions for data simulation. In addition, we extracted information on disease models typical of complex diseases by studying a comprehensive list of recent publications on GWAS.
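The concern raised above, that additional degrees of freedom may cost power, can be illustrated with a small simulation. The sketch below is not the simulation design used in this study. It is a minimal illustration, assuming that each test statistic follows a noncentral chi-square distribution and that the noncentrality parameter is the same for every test. The function name `simulated_power`, the noncentrality value of 10 and the number of replicates are hypothetical choices made only for this example.

```python
import random

# Upper 5% critical values of the central chi-square distribution
# (standard table values) for 1, 2 and 3 degrees of freedom.
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815}


def simulated_power(df, noncentrality, n_rep=20000, seed=1):
    """Monte Carlo power of a chi-square test with `df` d.f. at the 5% level.

    Illustrative assumption: the statistic is noncentral chi-square. It is
    drawn as a sum of `df` squared normal variates, with the whole
    noncentrality placed on the first one.
    """
    rng = random.Random(seed)
    shift = noncentrality ** 0.5
    threshold = CRITICAL_05[df]
    hits = 0
    for _ in range(n_rep):
        # First component carries the signal; the rest are pure noise.
        stat = (rng.gauss(0.0, 1.0) + shift) ** 2
        stat += sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df - 1))
        if stat > threshold:
            hits += 1
    return hits / n_rep


if __name__ == "__main__":
    # Same signal strength, increasing number of degrees of freedom.
    for df in (1, 2, 3):
        print(df, simulated_power(df, noncentrality=10.0))
```

With the noncentrality held fixed, the estimated power falls as the d.f. increase. A multi-marker or haplotype test can therefore only gain power if it captures enough additional signal, that is, a larger noncentrality, to offset this cost.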
Results of literature study

The guidelines for inclusion into the literature study were publication during the time period from January 2005 to 5-42,44-64 which met our inclusion criteria. In all publications, association analysis in the initial step was restrict...