A principal objective for phylogenetic experimental design is to predict the power of a data set to resolve nodes in a phylogenetic tree. However, proactively assessing the potential for phylogenetic noise compared with signal in a candidate data set has been a formidable challenge. Understanding how the collection of additional sequence data affects the resolution of recalcitrant internodes at diverse historical times will facilitate increasingly accurate and cost-effective phylogenetic research. Here, we derive theory based on the fundamental unit of the phylogenetic tree, the quartet, that applies estimates of the state space and the rates of evolution of characters in a data set to predict phylogenetic signal and phylogenetic noise and therefore to predict the power to resolve internodes. We develop and implement a Monte Carlo approach to estimating the power to resolve, and we derive a nearly equivalent, faster deterministic calculation. These approaches are applied to describe the distribution of potential signal, polytomy, or noise for two example data sets, one recent (cytochrome c oxidase I and 28S ribosomal RNA sequences from Diplazontinae parasitoid wasps) and one deep (eight nuclear genes and a phylogenomic sequence for diverse microbial eukaryotes including Stramenopiles, Alveolata, and Rhizaria). The predicted power of resolution for the loci analyzed is consistent with the historic use of the genes in phylogenetics.
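As an illustration of the Monte Carlo idea described above, the following is a minimal sketch and not the authors' implementation. It assumes a symmetric k-state substitution model (Jukes-Cantor when k = 4), a single rate shared by all characters, four subtending branches of equal length, and a simple parsimony-style vote in which a replicate counts as "signal" when sites supporting the true quartet outnumber those supporting either alternative, "polytomy" on a tie, and "noise" otherwise. The function names, parameters, and outcome definitions are hypothetical choices made for this example.

```python
import math
import random


def p_same(rate, t, k):
    """Probability that a character shows the same state after time t
    under a symmetric k-state model (assumed here, not taken from the paper)."""
    return 1.0 / k + (1.0 - 1.0 / k) * math.exp(-k / (k - 1.0) * rate * t)


def evolve(state, stay, k, rng):
    """Keep the state with probability `stay`, else move uniformly to another state."""
    if rng.random() < stay:
        return state
    other = rng.randrange(k - 1)
    return other if other < state else other + 1


def simulate_quartet(rate, t_inter, t_tip, k, n_sites, n_reps, seed=0):
    """Estimate the fractions of replicates ending in signal, polytomy, or noise
    for the true quartet AB|CD with internode length t_inter and tip length t_tip."""
    rng = random.Random(seed)
    stay_inter = p_same(rate, t_inter, k)
    stay_tip = p_same(rate, t_tip, k)
    outcomes = {"signal": 0, "polytomy": 0, "noise": 0}
    for _ in range(n_reps):
        support = [0, 0, 0]  # AB|CD (true), AC|BD, AD|BC
        for _ in range(n_sites):
            u = rng.randrange(k)                 # state at one end of the internode
            v = evolve(u, stay_inter, k, rng)    # state at the other end
            a = evolve(u, stay_tip, k, rng)
            b = evolve(u, stay_tip, k, rng)
            c = evolve(v, stay_tip, k, rng)
            d = evolve(v, stay_tip, k, rng)
            if a == b and c == d and a != c:
                support[0] += 1
            elif a == c and b == d and a != b:
                support[1] += 1
            elif a == d and b == c and a != b:
                support[2] += 1
        best_wrong = max(support[1], support[2])
        if support[0] > best_wrong:
            outcomes["signal"] += 1
        elif support[0] == best_wrong:
            outcomes["polytomy"] += 1
        else:
            outcomes["noise"] += 1
    return {name: count / n_reps for name, count in outcomes.items()}
```

Calling, for example, `simulate_quartet(rate=0.1, t_inter=1.0, t_tip=1.0, k=4, n_sites=200, n_reps=200)` returns the three estimated fractions, which sum to one. Varying `rate` for a fixed tree gives a rough sense of how the balance of signal and noise shifts with the rate of character evolution under these assumptions.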
With the rise of genome-scale datasets there has been a call for increased data scrutiny and careful selection of loci appropriate for attempting the resolution of a phylogenetic problem. Such loci are desired to maximize phylogenetic information content while minimizing the risk of homoplasy. Theory posits the existence of characters that evolve under such an optimum rate, and efforts to determine optimal rates of inference have been a cornerstone of phylogenetic experimental design for over two decades. However, both theoretical and empirical investigations of optimal rates have varied dramatically in their conclusions, ranging from no relationship to a tight relationship between the rate of change and phylogenetic utility. Here we synthesize these apparently contradictory views, demonstrating both empirical and theoretical conditions under which each is correct. We find that optimal rates of characters (not genes) are generally robust to most experimental design decisions. Moreover, consideration of site rate heterogeneity within a given locus is critical to accurate predictions of utility. Factors such as taxon sampling or the targeted number of characters providing support for a topology are additionally critical to the predictions of phylogenetic utility based on the rate of character change. Further, optimality of rates and predictions of phylogenetic utility are not equivalent, demonstrating the need for further development of comprehensive theory of phylogenetic experimental design.
Purpose
To identify trends in physician drug prescribing practices for sickle cell disease (SCD).
Methods
We used data from the National Disease and Therapeutic Index to evaluate medications prescribed to children (aged 19 years or younger) and adults (aged 20 years or older) with SCD by office‐based physicians in the United States during 1997 to 2017. Prescriptions were evaluated in 3‐year intervals.
Results
The proportion of SCD visits that included new/continued hydroxyurea prescriptions increased from less than or equal to 8% before 2009 to 33% in 2015 to 2017. The increase was significant in visits by children (2.5% in 1997‐1999 to 47% in 2015‐2017; P = .003 by Spearman's rank‐order correlation) but not in adults (6.9% to 11%; P = .12). Opioids, started/continued in 13% (lowest 3‐year average) to 35% (highest) of visits by children and 55% to 81% of visits by adults, remained the most frequently prescribed medications for SCD overall. There were no significant changes over time in opioid prescribing for adults (P = .64) or children (P = .38). Hematologists/oncologists accounted for a higher proportion of visits by children (67.2% over 1997‐2017) than adults (25.2%), while emergency medicine visits were higher in adults (14.0%) than children (2.6%).
Conclusions
This study suggests a robust increase in hydroxyurea prescribing for children with SCD. The BABY HUG trial, which demonstrated safety and efficacy of starting hydroxyurea in infancy and informed current SCD guidelines recommending broader use in children, may have contributed to this increase. However, hydroxyurea prescribing for adults remains infrequent and considerably lower than opioid prescribing. Barriers in access to specialist care persist for adults with SCD.
Background
The detection and avoidance of “long-branch effects” in phylogenetic inference represents a longstanding challenge for molecular phylogenetic investigations. A consequence of parallelism and convergence, long-branch effects arise in phylogenetic inference when there is unequal molecular divergence among lineages, and they can positively mislead inference based on parsimony especially, but also inference based on maximum likelihood and Bayesian approaches. Long-branch effects have been exhaustively examined by simulation studies that have compared the performance of different inference methods in specific model trees and branch length spaces.
Results
In this paper, by generalizing the phylogenetic signal and noise analysis to quartets with uneven subtending branches, we quantify the utility of molecular characters for resolution of quartet phylogenies via parsimony. Our quantification incorporates contributions toward the correct tree from either signal or homoplasy (i.e. “the right result for either the right reason or the wrong reason”). We also characterize a highly conservative lower bound of utility that incorporates contributions to the correct tree only when they correspond to true, unobscured parsimony-informative sites (i.e. “the right result for the right reason”). We apply the generalized signal and noise analysis to classic quartet phylogenies in which long-branch effects can arise due to unequal rates of evolution or an asymmetrical topology. Application of the analysis leads to identification of branch length conditions in which inference will be inconsistent and reveals insights regarding how to improve sampling of molecular loci and taxa in order to correctly resolve phylogenies in which long-branch effects are hypothesized to exist.
Conclusions
The generalized signal and noise analysis provides analytical prediction of utility of characters evolving at diverse rates of evolution to resolve quartet phylogenies with unequal branch lengths. The analysis can be applied to identifying characters evolving at appropriate rates to resolve phylogenies in which long-branch effects are hypothesized to occur.
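To make the idea of branch length conditions under which parsimony becomes inconsistent more concrete, here is a small sketch. It is not the generalized signal and noise analysis itself. It assumes a symmetric k-state substitution model, branch lengths given in expected substitutions per site, and a true quartet AB|CD, and it enumerates every assignment of states exactly to obtain the expected per-site frequency of patterns supporting each of the three possible quartets. The function names and the branch lengths in the usage note are hypothetical.

```python
import itertools
import math


def transition(length, k):
    """Return (P(same state), P(one specific other state)) for a branch of the
    given length under a symmetric k-state model (an assumption of this sketch)."""
    decay = math.exp(-k / (k - 1.0) * length)
    return 1.0 / k + (1.0 - 1.0 / k) * decay, (1.0 - decay) / k


def pattern_support(lengths, k=4):
    """Expected per-site frequency of parsimony-informative patterns supporting
    each quartet, for the true tree AB|CD.

    `lengths` maps the branch names "A", "B", "C", "D", and "inter" to lengths.
    """
    probs = {name: transition(length, k) for name, length in lengths.items()}

    def p(branch, x, y):
        same, diff = probs[branch]
        return same if x == y else diff

    support = {"AB|CD": 0.0, "AC|BD": 0.0, "AD|BC": 0.0}
    # u and v are the states at the two ends of the internode.
    for u, v, a, b, c, d in itertools.product(range(k), repeat=6):
        prob = (
            (1.0 / k)
            * p("inter", u, v)
            * p("A", u, a)
            * p("B", u, b)
            * p("C", v, c)
            * p("D", v, d)
        )
        if a == b and c == d and a != c:
            support["AB|CD"] += prob
        elif a == c and b == d and a != b:
            support["AC|BD"] += prob
        elif a == d and b == c and a != b:
            support["AD|BC"] += prob
    return support
```

With two long, non-sister branches, for example `pattern_support({"A": 1.5, "B": 0.05, "C": 1.5, "D": 0.05, "inter": 0.05})`, the expected support for AC|BD exceeds that for the true AB|CD, so adding more characters of this kind would be expected to reinforce the wrong quartet. With all five branches set to 0.05, the true quartet receives the most support.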