Current genome-wide association studies (GWAS) focusing on relatively common single-nucleotide polymorphisms (SNPs) usually adopt a cost-effective multi-staged design: at the initial stage, a proportion of the total samples is genotyped using a commercial SNP array with reasonably good coverage of the whole genome, and at the second stage a list of promising SNPs is further genotyped and evaluated on the remaining samples. This staged design can in principle also be used to study rare genetic variants at the genome-wide scale, but the statistical methods developed for evaluating relatively common SNPs under the staged design are not appropriate for rare variants because large-sample theory is not valid for them. Here, we develop a new statistical framework for evaluating rare variants under a two-staged (or multi-staged) design. Using extensive computer simulations, we evaluate the empirical type I error rate and power of the proposed procedures. A real example from two recent case-control genetic association studies of rheumatoid arthritis is also used to demonstrate the performance of the proposed methods.

Keywords: case-control study; GWAS; rare variants; two-staged design

INTRODUCTION

The current wave of genome-wide association studies (GWAS) focusing on relatively common single-nucleotide polymorphisms (SNPs) (minor allele frequency (MAF) >5%) has successfully identified hundreds of loci associated with risk of various diseases. To date, more than 5900 SNPs have been reported to be associated with different diseases (http://www.genome.gov/gwastudies/). However, some studies have suggested that the genetic variants for common diseases could have a wide spectrum of frequencies, ranging from rare to common, and that rare variants could exhibit a relatively large genetic effect (for example, an odds ratio greater than 2). [1-15] For example, in 2008, Stefansson et al.
[10] found that three rare deletions were associated with schizophrenia, with odds ratios of 2.7, 11.5 and 14.8, respectively. Some authors have proposed novel methods to detect associations between common diseases and multiple rare variants. [16][17][18][19][20]

Current GWAS with common SNPs usually adopt a cost-effective staged design in which a proportion of the available samples is genotyped at the initial stage using a commercially available SNP array with reasonable coverage of the whole genome, and a list of promising SNPs with P-values less than a given threshold is further genotyped on the rest of the samples at the second stage. [21][22][23][24] For data analysis, the joint analysis strategy, which combines the statistics from the two stages for the final evaluation of the association evidence, is generally more powerful than the replication-based analysis, which uses only the statistic from the second stage. [24][25][26] In principle, this staged design can be used for future GWAS or candidate gene association studies focusing on rare genetic variants. However, the analysis of rare variants under such a multi-sta...