A large array of species distribution model (SDM) approaches has been developed for explaining and predicting the occurrences of individual species or species assemblages. Given the wealth of existing models, it is unclear which models perform best for interpolation or extrapolation of existing data sets, particularly when one is concerned with species assemblages. We compared the predictive performance of 33 variants of 15 widely applied and recently emerged SDMs in the context of multispecies data, including both joint SDMs that model multiple species together, and stacked SDMs that model each species individually and combine the predictions afterward. We offer a comprehensive evaluation of these SDM approaches by examining their performance in predicting withheld empirical validation data of different sizes representing five different taxonomic groups, and for prediction tasks related to both interpolation and extrapolation. We measure predictive performance by 12 measures of accuracy, discrimination power, calibration, and precision of predictions, for the biological levels of species occurrence, species richness, and community composition. Our results show large variation among the models in their predictive performance, especially for communities comprising many species that are rare. The results do not reveal any major trade‐offs among measures of model performance; the same models performed generally well in terms of accuracy, discrimination, and calibration, and for the biological levels of individual species, species richness, and community composition. In contrast, the models that gave the most precise predictions were not well calibrated, suggesting that poorly performing models can make overconfident predictions. However, none of the models performed well for all prediction tasks. As a general strategy, we therefore propose that researchers fit a small set of models showing complementary performance, and then apply a cross‐validation procedure involving separate data to establish which of these models performs best for the goal of the study.
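The strategy proposed above (fit a few candidate models, then let held-out data decide) can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: the two candidate models (a prevalence-only baseline and a one-covariate logistic regression), the Brier score as the performance measure, the synthetic occurrence data, and all function names are hypothetical stand-ins.

```python
import math
import random


def brier(y_true, p_pred):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def fit_prevalence(xs, ys):
    """Baseline stand-in: predict the training prevalence everywhere."""
    prev = sum(ys) / len(ys)
    return lambda x: prev


def fit_logistic(xs, ys, lr=0.5, steps=500):
    """Stand-in model: one-covariate logistic regression via gradient descent."""
    w = b = 0.0
    n = len(xs)
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            err = 1.0 / (1.0 + math.exp(-(w * x + b))) - y
            gw += err * x
            gb += err
        w -= lr * gw / n
        b -= lr * gb / n
    return lambda x: 1.0 / (1.0 + math.exp(-(w * x + b)))


def cross_validate(fitters, xs, ys, k=5, seed=0):
    """Return the mean held-out Brier score of each candidate over k folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = {}
    for name, fit in fitters.items():
        fold_scores = []
        for held_out in folds:
            held = set(held_out)
            train = [i for i in idx if i not in held]
            model = fit([xs[i] for i in train], [ys[i] for i in train])
            preds = [model(xs[i]) for i in held_out]
            fold_scores.append(brier([ys[i] for i in held_out], preds))
        scores[name] = sum(fold_scores) / k
    return scores


# Synthetic presence/absence data driven by a single covariate (assumption).
rng = random.Random(1)
xs = [rng.uniform(-2, 2) for _ in range(200)]
ys = [1 if x + rng.gauss(0, 0.5) > 0 else 0 for x in xs]

scores = cross_validate(
    {"prevalence": fit_prevalence, "logistic": fit_logistic}, xs, ys
)
best = min(scores, key=scores.get)  # lower Brier score is better
```

For an extrapolation goal, the random folds would be replaced by folds that separate the data in space or along an environmental gradient, so that the held-out data resemble the intended prediction task.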
The classical tools in spatial statistics are stationary models, like the Matérn field. However, in some applications there are boundaries, holes, or physical barriers in the study area, e.g. a coastline, and stationary models will inappropriately smooth over these features, requiring the use of a non-stationary model. We propose a new model, the Barrier model, which is different from the established methods as it is not based on the shortest distance around the physical barrier, nor on boundary conditions. The Barrier model is based on viewing the Matérn correlation, not as a correlation function on the shortest distance between two points, but as a collection of paths through a Simultaneous Autoregressive (SAR) model. We then manipulate these local dependencies to cut off paths that are crossing the physical barriers. To make the new SAR well behaved, we formulate it as a stochastic partial differential equation (SPDE) that can be discretised to represent the Gaussian field, with a sparse precision matrix that is automatically positive definite. The main advantage of the Barrier model is that the computational cost is the same as for the stationary model. The model is easy to use, and can deal with both sparse data and very complex barriers, as shown in an application in the Finnish Archipelago Sea. Additionally, the Barrier model is better at reconstructing the modified Horseshoe test function than the standard models used in R-INLA.
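The idea of cutting local dependencies can be illustrated with a toy example. The sketch below is an analogy only, not the Barrier model's SPDE discretisation or anything from R-INLA: it assumes a one-dimensional chain of sites with precision matrix Q = kappa2 * I + L, where L is the graph Laplacian, and all names and parameter values are hypothetical. Dropping the edge across a "barrier" leaves Q sparse and positive definite while removing the correlation between the two sides.

```python
def chain_precision(n, kappa2=0.5, cut_edges=()):
    """Precision Q = kappa2 * I + Laplacian of a 1-D chain; cut edges are dropped."""
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        Q[i][i] = kappa2
    for i in range(n - 1):
        if (i, i + 1) in cut_edges:
            continue  # barrier: remove the local dependency between i and i+1
        Q[i][i] += 1.0
        Q[i + 1][i + 1] += 1.0
        Q[i][i + 1] -= 1.0
        Q[i + 1][i] -= 1.0
    return Q


def invert(A):
    """Gauss-Jordan inverse without pivoting; fine for small positive definite A."""
    n = len(A)
    M = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        piv = M[c][c]
        M[c] = [v / piv for v in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0.0:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]


def correlation(Q, i, j):
    """Correlation between sites i and j implied by the precision matrix Q."""
    S = invert(Q)
    return S[i][j] / (S[i][i] * S[j][j]) ** 0.5


n = 8
Q_open = chain_precision(n)
Q_barrier = chain_precision(n, cut_edges={(3, 4)})

open_corr = correlation(Q_open, 3, 4)        # neighbours are correlated
barrier_corr = correlation(Q_barrier, 3, 4)  # no path left across the cut
```

In one dimension a single cut disconnects the two sides entirely. In two dimensions, removing only the dependencies that cross a barrier would still leave dependence along paths that go around it, which is the behaviour the abstract describes.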
Gaussian process (GP) models are widely used in disease mapping as they provide a natural framework for modeling spatial correlations. Their challenges, however, lie in computational burden and memory requirements. In disease mapping models, the other difficulty is inference, which is analytically intractable due to the non-Gaussian observation model. In this paper, we address both these challenges. We show how to efficiently build fully and partially independent conditional (FIC/PIC) sparse approximations for the GP on a two-dimensional surface, and how to conduct approximate inference using the expectation propagation (EP) algorithm and the Laplace approximation (LA). We also propose to combine FIC with a compactly supported covariance function to construct a computationally efficient additive model that can model long and short length-scale spatial correlations simultaneously. The benefit of these approximations is computational. The sparse GPs speed up the computations and reduce the memory requirements. The posterior inference via EP and the Laplace approximation is much faster and is practically as accurate as via Markov chain Monte Carlo.
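For orientation, the FIC approximation is usually written as below. The notation is an assumption introduced here and does not come from the abstract: f denotes latent values at the n data locations, u latent values at m inducing locations with m much smaller than n, and the split into a long length-scale kernel and a compactly supported (CS) kernel in the second line is one plausible reading of the additive model described above.

```latex
% FIC: low-rank part plus an exact diagonal correction
\begin{align}
  K_{ff} &\approx Q_{ff} + \operatorname{diag}\!\left(K_{ff} - Q_{ff}\right),
  \qquad Q_{ff} = K_{fu} K_{uu}^{-1} K_{uf} \\
% Additive variant (assumed form): FIC for the long length-scale kernel,
% sparse compactly supported kernel for the short length-scale part
  K_{ff} &\approx Q^{\text{long}}_{ff}
    + \operatorname{diag}\!\left(K^{\text{long}}_{ff} - Q^{\text{long}}_{ff}\right)
    + K^{\text{cs}}_{ff}
\end{align}
```

Because the first form is a rank-m matrix plus a diagonal, it can be handled with the matrix inversion lemma at a cost on the order of n m^2 rather than n^3, which is where the computational savings mentioned in the abstract would come from.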
Summary. Cyclical outbreaks of pests can impact the functioning of entire ecosystems. An eminent example is outbreaks of crown‐of‐thorns starfish (COTS; Acanthaster planci) that cause substantial coral mortality on the Great Barrier Reef (GBR). We analyse COTS abundance and outbreaks with a Bayesian spatiotemporal model applied to a long‐term survey of the GBR (1985–2014). We assess the relative increase in COTS abundance beyond that explained by a reef's location and explanatory covariates, and thereby incorporate local reef characteristics into the identification of outbreaks, while allowing for both randomness and predictable patterns in the development of outbreaks. The model results confirm that waves of COTS outbreaks originate near Lizard Island (14·67°S) and progress in a northwesterly or southeasterly direction, with the southward wave progressing about 60 km year⁻¹. The model reveals several previously unidentified hotspots with high average COTS abundance. The abundance of COTS may also have decreased on reefs protected from fishing after an expansion of protected areas within the GBR Marine Park in 2004, which suggests that closing reefs to fishing may help control COTS. Synthesis and applications. In this study, we use 30 years of data from the Great Barrier Reef to show that the timing and geographic location of crown‐of‐thorns starfish (COTS) outbreaks can be modelled by incorporating covariates, spatial and spatiotemporal dependence within a single coherent framework. The model can be used to identify areas of high average COTS abundance, to assess the impact of fishery management actions such as no‐take areas and to identify areas where waves of outbreaks may originate. The identification of outbreaks from noisy long‐term spatially extensive data may help managers choose appropriate control strategies. This modelling approach is applicable to other ecosystems where outbreaks of damaging pests occur.
Predictive species distribution models are mostly based on statistical dependence between environmental and distributional data and therefore may fail to account for physiological limits and biological interactions that are fundamental when modelling species distributions under future climate conditions. Here, we developed a state-of-the-art method integrating biological theory with survey and experimental data in a way that allows us to explicitly model both physical tolerance limits of species and inherent natural variability in regional conditions and thereby improve the reliability of species distribution predictions under future climate conditions. By using a macroalga-herbivore association (Fucus vesiculosus - Idotea balthica) as a case study, we illustrated how salinity reduction and temperature increase under future climate conditions may significantly reduce the occurrence and biomass of these important coastal species. Moreover, we showed that the reduction of herbivore occurrence is linked to reduction of their host macroalgae. Spatial predictive modelling and experimental biology have been traditionally seen as separate fields but stronger interlinkages between these disciplines can improve species distribution projections under climate change. Experiments enable qualitative prior knowledge to be defined and identify cause-effect relationships, and thereby better foresee alterations in ecosystem structure and functioning under future climate conditions that are not necessarily seen in projections based on non-causal statistical relationships alone.