We develop an inference method that uses approximate Bayesian computation (ABC) to simultaneously estimate mutational parameters and selective constraint on the basis of nucleotide divergence for protein-coding genes between pairs of species. Our simulations explicitly model CpG hypermutability and transition vs. transversion mutational biases along with negative and positive selection operating on synonymous and nonsynonymous sites. We evaluate the method by simulations in which true mean parameter values are known and show that it produces reasonably unbiased parameter estimates as long as sequences are not too short and sequence divergence is not too low. We show that the use of quadratic regression within ABC offers an improvement over linear regression, but that weighted regression has little impact on the efficiency of the procedure. We apply the method to estimate mutational and selective constraint parameters in data sets of protein-coding genes extracted from the genome sequences of primates, murids, and carnivores. Estimates of CpG hypermutability are substantially higher in primates than in murids and carnivores. Nonsynonymous site selective constraint is substantially higher in murids and carnivores than in primates, and autosomal nonsynonymous constraint is higher than X-chromosome constraint in all taxa. We detect significant selective constraint at synonymous sites in primates, carnivores, and murid rodents. Synonymous site selective constraint is weakest in murids, a surprising result, considering that murid effective population sizes are likely to be considerably higher than those of the other two taxa.
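The regression step within ABC can be illustrated with a minimal sketch. It is a toy example and not the simulation model used in this study: it assumes a single parameter (a per-site divergence `theta` with a uniform prior), a binomial model for the number of differences, and one summary statistic (the observed proportion of differing sites). The function name `abc_regression` and all of its arguments are hypothetical, and the regression is unweighted.

```python
import numpy as np

def abc_regression(s_obs, n_sites=1000, n_sims=20000,
                   accept_frac=0.1, degree=2, seed=0):
    """Toy ABC with rejection followed by polynomial regression adjustment.

    degree=1 gives a linear adjustment, degree=2 a quadratic one.
    Returns the accepted prior draws and their regression-adjusted values.
    """
    rng = np.random.default_rng(seed)

    # 1. Draw parameter values from the (assumed) uniform prior.
    theta = rng.uniform(0.0, 0.5, size=n_sims)

    # 2. Simulate data under each draw and reduce it to a summary statistic.
    s = rng.binomial(n_sites, theta) / n_sites

    # 3. Rejection step: keep the simulations closest to the observed summary.
    dist = np.abs(s - s_obs)
    keep = dist <= np.quantile(dist, accept_frac)
    x = s[keep] - s_obs          # centred summary statistic
    t = theta[keep]              # accepted parameter values

    # 4. Fit theta ~ a + b*x (+ c*x^2) by unweighted least squares.
    design = np.vander(x, degree + 1, increasing=True)  # columns 1, x, x^2
    coef, *_ = np.linalg.lstsq(design, t, rcond=None)

    # 5. Project each accepted draw to x = 0, i.e. to the observed summary.
    adjusted = t - design[:, 1:] @ coef[1:]
    return t, adjusted
```

Calling `abc_regression(0.1, degree=1)` and `abc_regression(0.1, degree=2)` allows the linear and quadratic adjustments to be compared on the same simulated draws; in this toy setting the adjusted sample is expected to be more concentrated than the sample obtained by rejection alone.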
What fraction of new mutations in the genome are influenced by natural selection? One way to address this question is to compare levels of between-species nucleotide divergence at classes of candidate selectively evolving and neutrally evolving sites. For example, under the assumptions that nonsynonymous mutations are either strongly deleterious or neutral and that there exists a class of sites that evolves neutrally, the proportion of deleterious amino acid-changing mutations in a protein-coding gene can be estimated from $C = 1 - D_N / D_{Neutral}$, where $D_N$ is the rate of nonsynonymous substitutions. The neutral substitution rate, $D_{Neutral}$, has often been assumed to be equal to $D_S$, the rate of synonymous substitutions in protein-coding genes. That assumption, however, is not justified in many species (reviewed by Hershberg and Petrov 2008), and even in mammals some form of selection appears to operate on synonymous mutations. For example, between-species divergence at fourfold degenerate sites is significantly lower than in ancestral transposable element repeats (ARs) (Eöry et al. 2010), and ARs are among the best candidates for a class of sites that evolves neutrally (Lunter et al. 2006; Meader et al. 2010; Pollard et al. 2010). If we employ ARs as a neutral reference, nonsynonymous and synonymous selective constraint can be estimated as $C_N = 1 - D_N / D_{AR}$ and $C_S = 1 - D_S / D_{AR}$, respectively, where $D_{AR}$ is the substitution rate for intronic ARs. Note that intronic ARs represent a better local neutral reference than interge...