Genes encoding nuclear receptors (NRs) are attractive as candidates for investigating the evolution of gene regulation because they (1) have a direct effect on gene expression and (2) modulate many cellular processes that underlie development. We employed a three-phase investigation linking NR molecular evolution among primates with direct experimental assessment of NR function. Phase 1 was an analysis of NR domain evolution, and the results were used to guide the design of phase 2, a codon-model-based survey for alterations of natural selection within the hominids. Using a series of reliability and robustness analyses, we selected a single gene, NR2C1, as the best candidate for experimental assessment. In phase 3, we carried out assays to determine whether changes between the ancestral and extant NR2C1s could have impacted stem cell pluripotency. We evaluated human, chimpanzee, and ancestral NR2C1 for transcriptional modulation of Oct4 and Nanog (key regulators of pluripotency and cell lineage commitment), promoter activity for Pepck (a proxy for differentiation in numerous cell types), and average size of embryological stem cell colonies (a proxy for the self-renewal capacity of pluripotent cells). Results supported the signal for alteration of natural selection identified in phase 2. We suggest that adaptive evolution of gene regulation has impacted several aspects of pluripotentiality within primates.
Our study illustrates that the combination of targeted evolutionary surveys and experimental analysis is an effective strategy for investigating the evolution of gene regulation with respect to developmental phenotypes.
KEYWORDS: ancestral gene reconstruction (AGR); codon models; hominid evolutionary survey; nuclear receptors; NR2C1; testicular receptor 2 (TR2); pluripotentiality
Human evolutionary biology seeks to understand the origins of the defining characteristics of modern humans, such as our large brains, upright posture, obligatory bipedal gait, longevity, and extended juvenile period. While fossil morphology and artifacts recovered from archaeological sites are essential to inferring anatomical structure, function, and behavior in the past (McBrearty and Brooks 2000; Alemseged et al. 2006; Tryon et al. 2008; Jungers et al. 2009a,b; Braun et al. 2010; Ward et al. 2011), only through molecular genetic analyses can we make the ultimate connection between phenotype and genotype (Wood 1996; Allman et al. 2010; Boddy et al. 2012; Sherwood and Duka 2012). The eventual goal is to understand to what extent modern structures and functions are determined by different genetic systems and the extent to which the evolution of those systems has played a role in the evolution of the human lineage. The nuclear receptors (NRs), a superfamily of transcription factors, are attractive candidates for a combined evolutionary and functional investigation of hominids (i.e., the clade that includes modern great apes and their last common ancestors). As transcription factors, NRs control many aspects of development, metabo...
This unit provides protocols for using the CODEML program from the PAML package to make inferences about episodic natural selection in protein-coding sequences. The protocols cover inference tasks such as maximum likelihood estimation of selection intensity, testing the hypothesis of episodic positive selection, and identifying sites with a history of episodic evolution. We provide protocols for using the rich set of models implemented in CODEML to assess robustness, and for using bootstrapping to assess if the requirements for reliable statistical inference have been met. An example dataset is used to illustrate how the protocols are used with real protein-coding sequences. The workflow of this design, through automation, is readily extendable to a larger-scale evolutionary survey. © 2016 by John Wiley & Sons, Inc.
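The bootstrapping step mentioned above can be illustrated with a minimal Python sketch. It resamples whole codon columns with replacement, so that each replicate alignment could be passed back to a codon-model fit. The function name `bootstrap_codon_alignment` and the dictionary representation of the alignment are assumptions made for illustration; they are not part of the published protocols or of PAML.

```python
import random


def bootstrap_codon_alignment(alignment, rng):
    """Return one bootstrap replicate of a codon alignment.

    alignment: dict mapping sequence name -> aligned nucleotide string.
    rng: a random.Random instance, so that replicates are reproducible.
    Whole codon columns are resampled, which keeps the reading frame intact.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    n_nt = lengths.pop()
    if n_nt == 0 or n_nt % 3 != 0:
        raise ValueError("aligned length must be a positive multiple of 3")

    n_codons = n_nt // 3
    # Draw codon column indices with replacement, same draws for every taxon.
    picks = [rng.randrange(n_codons) for _ in range(n_codons)]
    return {
        name: "".join(seq[3 * i:3 * i + 3] for i in picks)
        for name, seq in alignment.items()
    }


# Hypothetical toy alignment, for illustration only.
toy = {"human": "ATGAAACCC", "chimp": "ATGAAGCCA"}
replicate = bootstrap_codon_alignment(toy, random.Random(1))
```

Each replicate would then be analyzed with the same model as the original data, and the spread of the resulting parameter estimates gives a rough view of their stability.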
To detect positive selection at individual amino acid sites, most methods use an empirical Bayes approach. After parameters of a Markov process of codon evolution are estimated via maximum likelihood, they are passed to Bayes' formula to compute the posterior probability that a site evolved under positive selection. A difficulty with this approach is that parameter estimates with large errors can negatively impact Bayesian classification. By assigning priors to some parameters, Bayes Empirical Bayes (BEB) mitigates this problem. However, as implemented, it imposes uniform priors, which causes it to be overly conservative in some cases. When standard regularity conditions are not met and parameter estimates are unstable, inference, even under BEB, can be negatively impacted. We present an alternative to BEB called smoothed bootstrap aggregation (SBA), which bootstraps site patterns from an alignment of protein-coding DNA sequences to accommodate the uncertainty in the parameter estimates. We show that deriving the correction for parameter uncertainty from the data in hand, in combination with kernel smoothing techniques, improves site-specific inference of positive selection. We compare BEB to SBA by simulation and real data analysis. Simulation results show that SBA balances accuracy and power at least as well as BEB, and when parameter estimates are unstable, the performance gap between BEB and SBA can widen in favor of SBA. SBA is applicable to a wide variety of other inference problems in molecular evolution.
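The empirical Bayes step described above can be sketched in a few lines of Python: given estimated site-class proportions and the likelihood of one site under each class, Bayes' formula yields the posterior probability of each class. The second function averages those posteriors over bootstrap fits. This is plain bootstrap aggregation only; it omits the kernel smoothing that SBA adds and should not be read as the authors' implementation. All function names and numbers are hypothetical.

```python
def site_class_posteriors(class_weights, site_likelihoods):
    """Posterior probability of each site class for a single site.

    class_weights: estimated proportion of sites in each class (prior).
    site_likelihoods: likelihood of the site's data under each class.
    """
    joint = [w * lik for w, lik in zip(class_weights, site_likelihoods)]
    total = sum(joint)
    if total <= 0:
        raise ValueError("site has zero likelihood under every class")
    return [j / total for j in joint]


def bagged_posteriors(replicates):
    """Average the per-class posteriors over bootstrap fits.

    replicates: list of (class_weights, site_likelihoods) pairs, one pair
    per bootstrap replicate, all describing the same site.
    """
    if not replicates:
        raise ValueError("need at least one replicate")
    per_rep = [site_class_posteriors(w, lik) for w, lik in replicates]
    return [sum(col) / len(per_rep) for col in zip(*per_rep)]


# Hypothetical values: three site classes, the last one with omega > 1.
posterior = site_class_posteriors([0.7, 0.2, 0.1], [1e-5, 2e-5, 8e-5])
```

Averaging over replicates lets unstable parameter estimates pull the posterior toward less extreme values, which is the intuition behind deriving the correction from the data in hand.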
Motivation: Likelihood ratio tests are commonly used to test for positive selection acting on proteins. They are usually applied with thresholds for declaring a protein under positive selection determined from a chi-square or mixture of chi-square distributions. Although it is known that such distributions are not strictly justified due to the statistical irregularity of the problem, the hope has been that the resulting tests are conservative and do not lose much power in comparison with the same test using the unknown, correct threshold. We show that commonly used thresholds need not yield conservative tests, but instead give larger than expected Type I error rates. Statistical regularity can be restored by using a modified likelihood ratio test. Results: We give theoretical results to prove that, if the number of sites is not too small, the modified likelihood ratio test gives approximately correct Type I error probabilities regardless of the parameter settings of the underlying null hypothesis. Simulations show that modification gives Type I error rates closer to those stated without a loss of power. The simulations also show that parameter estimation for mixture models of codon evolution can be challenging in certain data-generation settings with very different mixing distributions giving nearly identical site pattern distributions unless the number of taxa and tree length are large. Because mixture models are widely used for a variety of problems in molecular evolution, the challenges and general approaches to solving them presented here are applicable in a broader context. Availability and implementation: https://github.com/jehops/codeml_modl Supplementary information: Supplementary data are available at Bioinformatics online.
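For orientation, the conventional thresholds discussed above can be sketched in Python using only the standard library. The sketch computes the likelihood ratio statistic from two log-likelihoods and a p-value under either a chi-square distribution with 1 degree of freedom or a 50:50 mixture of a point mass at zero and that chi-square. The 50:50 mixture and the single degree of freedom are assumptions chosen for illustration; this is the conventional test the abstract critiques, not the modified likelihood ratio test it proposes. The log-likelihood values are hypothetical.

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """Likelihood ratio statistic 2 * (lnL_alt - lnL_null), floored at 0."""
    return max(0.0, 2.0 * (lnl_alt - lnl_null))


def chi2_sf_1df(x):
    """Upper tail probability of a chi-square with 1 degree of freedom.

    Uses the identity P(Z**2 > x) = erfc(sqrt(x / 2)) for standard normal Z.
    """
    if x <= 0:
        return 1.0
    return math.erfc(math.sqrt(x / 2.0))


def mixture_p_value(stat):
    """P-value under a 50:50 mixture of a point mass at 0 and chi-square(1)."""
    if stat <= 0:
        return 1.0
    return 0.5 * chi2_sf_1df(stat)


# Hypothetical log-likelihoods from a null and an alternative codon model.
stat = lrt_statistic(lnl_null=-2301.7, lnl_alt=-2299.9)
p_chi2 = chi2_sf_1df(stat)
p_mix = mixture_p_value(stat)
```

For any positive statistic the mixture p-value is half the chi-square p-value, so the choice of reference distribution directly changes which genes cross a fixed significance threshold.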