Total-evidence dating (TED) allows evolutionary biologists to incorporate a wide range of dating information into a unified statistical analysis. One might expect this to improve the agreement between rocks and clocks but this is not necessarily the case. We explore the reasons for such discordance using a mammalian dataset with rich molecular, morphological and fossil information. There is strong conflict in this dataset between morphology and molecules under standard stochastic models. This causes TED to push divergence events back in time when using inadequate models or vague priors, a phenomenon we term ‘deep root attraction’ (DRA). We identify several causes of DRA. Failure to account for diversified sampling results in dramatic DRA, but this can be addressed using existing techniques. Inadequate morphological models also appear to be a major contributor to DRA. The major reason seems to be that current models do not account for dependencies among morphological characters, causing distorted topology and branch length estimates. This is particularly problematic for huge morphological datasets, which may contain large numbers of correlated characters. Finally, diversification and fossil sampling priors that do not incorporate all the available background information can contribute to DRA, but these priors can also be used to compensate for DRA. Specifically, we show that DRA in the mammalian dataset can be addressed by introducing a modest extra penalty for ghost lineages that are unobserved in the fossil record, for instance by assuming rapid diversification, rare extinction or high fossil sampling rate; any of these assumptions produces highly congruent divergence time estimates with a minimal gap between rocks and clocks. 
Under these conditions, fossils have a stabilizing influence on divergence time estimates and significantly increase the precision of those estimates, which are generally close to the dates suggested by palaeontologists. This article is part of the themed issue ‘Dating species divergences using rocks and clocks’.
DNA metabarcoding allows the analysis of insect communities faster and more efficiently than ever before. However, metabarcoding can be conducted through several approaches, and the consistency of results across methods has rarely been studied. We compare the results obtained by DNA metabarcoding of the same communities using two different markers – COI and 16S – and three different sampling methods: (a) homogenized Malaise trap samples (homogenate), (b) preservative ethanol from the same samples, and (c) soil samples. Our results indicate that COI and 16S offer partly complementary information on Malaise trap samples, with each marker detecting a significant number of species not detected by the other. Different sampling methods offer highly divergent estimates of community composition. The community recovered from preservative ethanol of Malaise trap samples is significantly different from that recovered from homogenate. Small and weakly sclerotized insects tend to be overrepresented in ethanol while strong and large taxa are overrepresented in homogenate. For soil samples, highly degenerate COI primers pick up large amounts of nontarget DNA and only 16S provides adequate analyses of insect diversity. However, even with 16S, very little overlap in molecular operational taxonomic unit (MOTU) content was found between the trap and the soil samples. Our results demonstrate that none of the tested sampling approaches is satisfactory on its own. For instance, DNA extraction from preservative ethanol is not a valid replacement for destructive bulk extraction but a complement. In future metabarcoding studies, both should ideally be used together to achieve comprehensive representation of the target community.
The Swedish Malaise Trap Project (SMTP) is one of the most ambitious insect inventories ever attempted. The project was designed to target poorly known insect groups across a diverse range of habitats in Sweden. The field campaign involved the deployment of 73 Malaise traps at 55 localities across the country for three years (2003-2006). Over the past 15 years, the collected material has been hand sorted by trained technicians into over 300 taxonomic fractions suitable for expert attention. The resulting collection is a tremendous asset for entomologists around the world, especially as we now face a desperate need for baseline data to evaluate phenomena like insect decline and climate change. Here, we describe the history, organisation, methodology and logistics of the SMTP, focusing on the rationale for the decisions taken and the lessons learned along the way. The SMTP represents one of the early instances of community science applied to large-scale inventory work, with a heavy reliance on volunteers in both the field and the laboratory. We give estimates of both staff effort and volunteer effort involved. The project has been funded by the Swedish Taxonomy Initiative; in total, the inventory has cost less than 30 million SEK (approximately 3.1 million USD). Based on a subset of the samples, we characterise the size and taxonomic composition of the SMTP material. Several different extrapolation methods suggest that the material comprises around 20 million specimens in total. The material is dominated by Diptera (75% of the specimens) and Hymenoptera (15% of specimens). Amongst the Diptera, the dominant groups are Chironomidae (37% of specimens), Sciaridae (15%), Phoridae (13%), Cecidomyiidae (9.5%) and Mycetophilidae (9.4%). Within Hymenoptera, the major groups are Ichneumonidae (44% of specimens), Diaprioidea (19%), Braconidae (9.6%), Platygastroidea (8.5%) and Chalcidoidea (7.9%). The taxonomic composition varies with latitude and season. 
Several Diptera and Hymenoptera groups are more common in non-summer samples (collected from September to April) and in the North, while others show the opposite pattern. About 1% of the total material has been processed and identified by experts so far. This material represents over 4,000 species. One third of these had not been recorded from Sweden before and almost 700 of them are new to science. These results reveal the large amounts of taxonomic work still needed on Palaearctic insect faunas. Based on the SMTP experiences, we discuss aspects of planning and conducting future large-scale insect inventory projects using mainly traditional approaches in relation to more recent approaches that rely on molecular techniques.
Over recent years, several alternative relaxed clock models have been proposed in the context of Bayesian dating. These models fall in two distinct categories: uncorrelated and autocorrelated across branches. The choice between these two classes of relaxed clocks is still an open question. More fundamentally, the true process of rate variation may have both long-term trends and short-term fluctuations, suggesting that more sophisticated clock models unfolding over multiple time scales should ultimately be developed. Here, a mixed relaxed clock model is introduced, which can be mechanistically interpreted as a rate variation process undergoing short-term fluctuations on the top of Brownian long-term trends. Statistically, this mixed clock represents an alternative solution to the problem of choosing between autocorrelated and uncorrelated relaxed clocks, by proposing instead to combine their respective merits. Fitting this model on a dataset of 105 placental mammals, using both node-dating and tip-dating approaches, suggests that the two pure clocks, Brownian and white noise, are rejected in favour of a mixed model with approximately equal contributions for its uncorrelated and autocorrelated components. The tip-dating analysis is particularly sensitive to the choice of the relaxed clock model. In this context, the classical pure Brownian relaxed clock appears to be overly rigid, leading to biases in divergence time estimation. By contrast, the use of a mixed clock leads to more recent and more reasonable estimates for the crown ages of placental orders and superorders. Altogether, the mixed clock introduced here represents a first step towards empirically more adequate models of the patterns of rate variation across phylogenetic trees. This article is part of the themed issue ‘Dating species divergences using rocks and clocks’.
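The idea of short-term fluctuations layered on a Brownian long-term trend can be illustrated with a small simulation. The sketch below is only an illustration under simplifying assumptions and is not the parameterization used in the article: it follows a single lineage (a chain of branches) rather than a full tree, works on the log-rate scale, and uses Gaussian white noise for the uncorrelated part. The function name `simulate_mixed_clock` and the parameters `sigma_brownian` and `sigma_white` are hypothetical.

```python
import math
import random


def simulate_mixed_clock(branch_lengths, root_log_rate=0.0,
                         sigma_brownian=0.5, sigma_white=0.5, seed=None):
    """Simulate substitution rates for consecutive branches of one lineage.

    Assumed (illustrative) model: the log-rate of each branch is the sum of
      - an autocorrelated part: a Brownian trend whose increment over a
        branch of duration t has variance sigma_brownian**2 * t, and
      - an uncorrelated part: Gaussian noise with variance sigma_white**2,
        drawn independently for every branch.
    """
    rng = random.Random(seed)
    trend = root_log_rate  # long-term component, inherited along the lineage
    rates = []
    for t in branch_lengths:
        # Brownian step: carried over to all descendant branches.
        trend += rng.gauss(0.0, sigma_brownian * math.sqrt(t))
        # White-noise fluctuation: affects this branch only.
        noise = rng.gauss(0.0, sigma_white)
        rates.append(math.exp(trend + noise))
    return rates


if __name__ == "__main__":
    lengths = [1.0, 2.0, 0.5, 3.0]
    print(simulate_mixed_clock(lengths, seed=42))
```

Setting `sigma_white` to zero gives a pure autocorrelated (Brownian) clock in this sketch, and setting `sigma_brownian` to zero gives a pure uncorrelated clock, which mirrors the two pure models the mixed clock is compared against.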
Statistical phylogenetic analysis currently relies on complex, dedicated software packages, making it difficult for evolutionary biologists to explore new models and inference strategies. Recent years have seen more generic solutions based on probabilistic graphical models, but this formalism can only partly express phylogenetic problems. Here, we show that universal probabilistic programming languages (PPLs) solve the expressivity problem, while still supporting automated generation of efficient inference algorithms. To prove the latter point, we develop automated generation of sequential Monte Carlo (SMC) algorithms for PPL descriptions of arbitrary biological diversification (birth-death) models. SMC is a new inference strategy for these problems, supporting both parameter inference and efficient estimation of Bayes factors that are used in model testing. We take advantage of this in automatically generating SMC algorithms for several recent diversification models that have been difficult or impossible to tackle previously. Finally, applying these algorithms to 40 bird phylogenies, we show that models with slowing diversification, constant turnover and many small shifts generally explain the data best. Our work opens up several related problem domains to PPL approaches, and shows that few hurdles remain before these techniques can be effectively applied to the full range of phylogenetic models.
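A diversification (birth-death) model is at heart a generative process, which is what makes it natural to write as a program. The sketch below is a plain forward simulation of the simplest case, constant birth and death rates, written in ordinary Python rather than in a probabilistic programming language; it does not implement the SMC inference described in the abstract. The function name `simulate_birth_death` and its parameters are hypothetical.

```python
import random


def simulate_birth_death(birth_rate, death_rate, duration, seed=None):
    """Return the number of lineages alive after `duration` time units.

    Assumed model: constant-rate birth-death process starting from one
    lineage. Each lineage splits at rate `birth_rate` and goes extinct at
    rate `death_rate`; waiting times between events are exponential.
    """
    rng = random.Random(seed)
    n = 1      # current number of lineages
    t = 0.0    # elapsed time
    while n > 0:
        total_rate = n * (birth_rate + death_rate)
        if total_rate == 0.0:
            break  # no events can occur
        t += rng.expovariate(total_rate)  # waiting time to the next event
        if t >= duration:
            break  # next event falls after the end of the simulation
        # Choose the event type in proportion to the two rates.
        if rng.random() < birth_rate / (birth_rate + death_rate):
            n += 1  # speciation
        else:
            n -= 1  # extinction
    return n


if __name__ == "__main__":
    print(simulate_birth_death(birth_rate=1.0, death_rate=0.5,
                               duration=3.0, seed=1))
```

In a PPL setting, a program of this kind would serve as the model description, and inference over its parameters would be delegated to an automatically generated algorithm.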