Population admixture and stratification generally occur together. Admixture between subpopulations generates allelic associations that decay with map distance, whereas stratification generates allelic associations that are independent of map distance. The autocorrelation of ancestry on gametes inherited from parents of mixed descent can be exploited to localize genes in which the pool of disease risk alleles is differentially distributed between subpopulations. Tests for linkage are based on testing for association of the disease or outcome of interest with locus ancestry, conditioning on parental admixture proportions to eliminate confounding by genetic background. This approach, known as admixture mapping, is an extension of the principles underlying linkage analysis of an experimental cross between inbred strains. Most current approaches to modelling admixture are based on a standard model in which the stochastic variation of ancestry on gametes inherited from admixed parents is generated by independent Poisson arrival processes. For such models, the posterior distribution of locus ancestry and parental admixture proportions can be generated by Markov chain Monte Carlo simulation, given a sample of individuals typed at ancestry-informative marker loci. Tests for linkage are constructed by averaging over this posterior distribution. The same statistical model can be used with unselected marker loci to detect and control for population stratification in ordinary genetic association studies. For studies using arrays of closely spaced tag SNPs, this approach has limitations as it requires that the marker loci be spaced far enough apart for all allelic association to be attributable to admixture and stratification. An alternative approach is to use principal components analysis to infer a few underlying axes of variation that summarize the allelic associations in the dataset. 
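As a minimal sketch of the principal components idea, the following Python code simulates genotypes for two subpopulations, standardizes each marker, and extracts the leading axis of variation by power iteration. Everything specific here is an illustrative assumption rather than something taken from the chapter: the function names, the toy simulation, the allele frequency shift `delta`, and the sample sizes. The sketch also assumes markers spaced far enough apart to be treated as independent.

```python
import random


def simulate_genotypes(n_per_pop, n_loci, delta, rng):
    """Toy model (an assumption): two subpopulations whose allele
    frequencies are p + shift and p - shift at each locus."""
    freqs = [rng.uniform(0.3, 0.7) for _ in range(n_loci)]
    shifts = [rng.choice((-delta, delta)) for _ in range(n_loci)]
    rows, labels = [], []
    for pop, sign in ((0, 1), (1, -1)):
        for _ in range(n_per_pop):
            row = []
            for p, s in zip(freqs, shifts):
                q = p + sign * s
                # Genotype = number of copies of the allele (0, 1 or 2).
                row.append((rng.random() < q) + (rng.random() < q))
            rows.append(row)
            labels.append(pop)
    return rows, labels


def standardize(rows):
    """Centre and scale each marker; drop markers with no variation."""
    n = len(rows)
    columns = []
    for j in range(len(rows[0])):
        col = [rows[i][j] for i in range(n)]
        mean = sum(col) / n
        var = sum((g - mean) ** 2 for g in col) / n
        if var > 0:
            sd = var ** 0.5
            columns.append([(g - mean) / sd for g in col])
    return columns


def covariance_between_individuals(columns):
    """n x n matrix of average products of standardized genotypes."""
    n = len(columns[0])
    m = len(columns)
    return [[sum(c[i] * c[k] for c in columns) / m for k in range(n)]
            for i in range(n)]


def top_component(matrix, rng, n_iter=200):
    """Leading eigenvector of a symmetric matrix by power iteration."""
    n = len(matrix)
    v = [rng.gauss(0, 1) for _ in range(n)]
    for _ in range(n_iter):
        w = [sum(matrix[i][k] * v[k] for k in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


rng = random.Random(1)
genotypes, labels = simulate_genotypes(20, 200, 0.15, rng)
scores = top_component(
    covariance_between_individuals(standardize(genotypes)), rng)
```

In this toy setting the entries of `scores` for the two simulated subpopulations fall on opposite sides of zero, so the leading component could serve as a covariate summarizing genetic background.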
This approach is computationally efficient, and can be extended to datasets with closely spaced markers using a simple adjustment for short-range allelic associations. Tests of the number of underlying latent variables can be constructed; for a given F_ST distance between subpopulations, there is a threshold size of dataset at which stratification can be detected. With either modelling approach, confounding by stratification can be controlled by adjusting for
Handbook of Statistical Genetics, Third Edition. Edited by D. J. Balding, M. Bishop and C. Cannings.