Exome sequences, which comprise all protein-coding regions, are promising data sets for studies of natural selection because they offer unbiased genome-wide estimates of polymorphism while focusing on the portions of the genome that are most likely to be functionally important. We examine genomic patterns of polymorphism within 10 diploid autosomal exomes from individuals of European and African descent. Using coalescent simulations, we show how polymorphism, site frequency spectra, and intercontinental divergence in these samples would be influenced by different modes of positive selection. We examine putatively selected loci from four previous genome-wide scans of SNP genotypes and demonstrate that these regions indeed show unusual population genetic patterns in the exome data. Using a series of conservative criteria based on exome polymorphism, we are able to fine-map signatures of selection, in many cases pinpointing a single candidate SNP. We also identify and evaluate novel candidate genes under selection that show unusual patterns of polymorphism. We sequence a portion of one novel candidate locus, IVL, in 74 individuals from multiple continents and examine global genetic diversity. Thus, we confirm, narrow, and supplement existing catalogs of putative targets of selection, and show that exome data sets, which are likely to soon become common, will be powerful tools for identifying adaptive genetic variation.
[Supplemental material is available online at http://www.genome.org.]
In order to extract well-supported regions of recent adaptation from existing catalogs of putatively selected loci, it is important to reevaluate and refine such lists using data that are free from ascertainment biases. Fortunately, better-suited genome-wide data sets are beginning to emerge. These include sets of all genomic exons, or "exomes," which are more practical to sequence at high coverage in multiple individuals than whole genomes.
Although the sample sizes are still small, analysis of these genome-wide sequence data sets can be useful for evolutionary studies, as the unbiased estimates of polymorphism and divergence they provide can be used to assess previously identified candidate regions under selection and more precisely determine targets of selection.
Here, we analyze the autosomal exomes of four African and six European individuals (Ng et al. 2009). We first perform coalescent simulations with selection to evaluate whether selection could leave a signature in the exomes of a small number of individuals. We then test whether genomic regions previously identified as possible targets of positive selection show evidence of non-neutrality in the exome data, and we filter the candidate regions accordingly. We also identify and evaluate several novel regions of unusual polymorphism suggestive of positive selection, and we collect and analyze additional sequence data for one of the most interesting novel genes.
Results
Simulations
In order to test the hypothesis that polymorphism and divergence in a small exome data set of four African and six ...