Collinearity refers to the non‐independence of predictor variables, usually in a regression‐type analysis. It is a common feature of any descriptive ecological data set and can be a problem for parameter estimation because it inflates the variance of regression parameters and hence potentially leads to the wrong identification of relevant predictors in a statistical model. Collinearity is a severe problem when a model is trained on data from one region or time, and predicted to another with a different or unknown structure of collinearity. To demonstrate the reach of the problem of collinearity in ecology, we show how relationships among predictors differ between biomes, change over spatial scales and through time. Across disciplines, different approaches to addressing collinearity problems have been developed, ranging from clustering of predictors, threshold‐based pre‐selection, through latent variable methods, to shrinkage and regularisation. Using simulated data with five predictor‐response relationships of increasing complexity and eight levels of collinearity we compared ways to address collinearity with standard multiple regression and machine‐learning approaches. We assessed the performance of each approach by testing its impact on prediction to new data. In the extreme, we tested whether the methods were able to identify the true underlying relationship in a training dataset with strong collinearity by evaluating its performance on a test dataset without any collinearity. We found that methods specifically designed for collinearity, such as latent variable methods and tree‐based models, did not outperform the traditional GLM and threshold‐based pre‐selection. Our results highlight the value of GLM in combination with penalised methods (particularly ridge) and threshold‐based pre‐selection when omitted variables are considered in the final interpretation.
However, all approaches tested yielded degraded predictions under change in collinearity structure, and the ‘folk lore’ threshold of |r| > 0.7 for correlation coefficients between predictor variables was an appropriate indicator of when collinearity begins to severely distort model estimation and subsequent prediction. The use of ecological understanding of the system in pre‐analysis variable selection and the choice of the least sensitive statistical approaches reduce the problems of collinearity, but cannot ultimately solve them.
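As a rough illustration of threshold-based pre-selection with the |r| > 0.7 rule of thumb, the following sketch flags strongly correlated predictor pairs and greedily keeps a subset. The function names, the greedy rule (earlier columns take priority) and the simulated predictors `temp`, `elev` and `rain` are assumptions made for this example, not the procedure used in the study.

```python
import numpy as np

def collinear_pairs(X, names, threshold=0.7):
    """Return (name_i, name_j, r) for predictor pairs with |r| > threshold."""
    r = np.corrcoef(X, rowvar=False)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(r[i, j]) > threshold:
                pairs.append((names[i], names[j], float(r[i, j])))
    return pairs

def preselect(X, names, threshold=0.7):
    """Greedy pre-selection (hypothetical rule): keep a predictor only if its
    |r| with every predictor kept so far does not exceed the threshold.
    Column order decides which member of a collinear pair survives."""
    r = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for j in range(len(names)):
        if all(r[j, k] <= threshold for k in kept):
            kept.append(j)
    return [names[k] for k in kept]

# Simulated predictors (illustrative only)
rng = np.random.default_rng(0)
n = 500
temp = rng.normal(size=n)
elev = -0.9 * temp + 0.3 * rng.normal(size=n)  # strongly collinear with temp
rain = rng.normal(size=n)                      # independent of both
X = np.column_stack([temp, elev, rain])
names = ["temp", "elev", "rain"]

flagged = collinear_pairs(X, names)
kept = preselect(X, names)
print(flagged)
print(kept)
```

In practice the choice of which variable to keep from a flagged pair would rest on ecological understanding rather than column order, and the omitted variable still has to be considered when interpreting the fitted model.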
Species distributional or trait data based on range map (extent-of-occurrence) or atlas survey data often display spatial autocorrelation, i.e. locations close to each other exhibit more similar values than those further apart. If this pattern remains present in the residuals of a statistical model based on such data, one of the key assumptions of standard statistical analyses, that residuals are independent and identically distributed (i.i.d.), is violated. The violation of the assumption of i.i.d. residuals may bias parameter estimates and can increase type I error rates (falsely rejecting the null hypothesis of no effect). While this is increasingly recognised by researchers analysing species distribution data, there is, to our knowledge, no comprehensive overview of the many available spatial statistical methods to take spatial autocorrelation into account in tests of statistical significance. Here, we describe six different statistical approaches to infer correlates of species' distributions, for both presence/absence (binary response) and species abundance data (Poisson or normally distributed response), while accounting for spatial autocorrelation in model residuals: autocovariate regression; spatial eigenvector mapping; generalised least squares; (conditional and simultaneous) autoregressive models and generalised estimating equations. A comprehensive comparison of the relative merits of these methods is beyond the scope of this paper. To demonstrate each method's implementation, however, we undertook preliminary tests based on simulated data. These preliminary tests verified that most of the spatial modeling techniques we examined showed good type I error control and precise parameter estimates, at least when confronted with simplistic simulated data containing
Restrictions on roaming. Until the past century or so, the movement of wild animals was relatively unrestricted, and their travels contributed substantially to ecological processes. As humans have increasingly altered natural habitats, natural animal movements have been restricted. Tucker et al. examined GPS locations for more than 50 species. In general, animal movements were shorter in areas with high human impact, likely owing to changed behaviors and physical limitations. Besides affecting the species themselves, such changes could have wider effects by limiting the movement of nutrients and altering ecological interactions. Science, this issue p. 466
Statistical models are the traditional choice to test scientific theories when observations, processes or boundary conditions are subject to stochasticity. Many important systems in ecology and biology, however, are difficult to capture with statistical models. Stochastic simulation models offer an alternative, but they were hitherto associated with a major disadvantage: their likelihood functions can usually not be calculated explicitly, and thus it is difficult to couple them to well-established statistical theory such as maximum likelihood and Bayesian statistics. A number of new methods, among them Approximate Bayesian Computing and Pattern-Oriented Modelling, bypass this limitation. These methods share three main principles: aggregation of simulated and observed data via summary statistics, likelihood approximation based on the summary statistics, and efficient sampling. We discuss principles as well as advantages and caveats of these methods, and demonstrate their potential for integrating stochastic simulation models into a unified framework for statistical modelling.
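The three principles named above can be sketched with the simplest member of this family, rejection-based Approximate Bayesian Computation. Everything specific here is an assumption made for illustration: the Poisson "simulator" stands in for a real stochastic simulation model, the mean is an arbitrary summary statistic, the prior is flat on [0, 10], and plain rejection sampling is the least efficient of the available samplers.

```python
import numpy as np

def simulate(rate, n, rng):
    """Stand-in stochastic simulator: n Poisson counts with the given rate."""
    return rng.poisson(rate, size=n)

def summary(data):
    """Aggregate a data set into one summary statistic (here, its mean)."""
    return float(np.mean(data))

def abc_rejection(observed, n_draws, epsilon, rng):
    """Keep prior draws whose simulated summary statistic lies within
    epsilon of the observed one; the kept draws approximate the posterior."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        rate = rng.uniform(0.0, 10.0)  # flat prior (assumed)
        s_sim = summary(simulate(rate, len(observed), rng))
        if abs(s_sim - s_obs) <= epsilon:
            accepted.append(rate)
    return np.array(accepted)

rng = np.random.default_rng(1)
observed = simulate(4.0, 200, rng)  # "field data" generated with rate 4
posterior = abc_rejection(observed, n_draws=5000, epsilon=0.2, rng=rng)
print(len(posterior), posterior.mean())
```

The accepted parameter values concentrate around the rate that generated the data without the likelihood ever being evaluated. Shrinking `epsilon` tightens the approximation at the cost of fewer accepted draws, which is the trade-off that more efficient samplers try to ease.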
Biological control of pests by natural enemies is a major ecosystem service delivered to agriculture worldwide. Quantifying and predicting its effectiveness at large spatial scales is critical for increased sustainability of agricultural production. Landscape complexity is known to benefit natural enemies, but its effects on interactions between natural enemies and the consequences for crop damage and yield are unclear. Here, we show that pest control at the landscape scale is driven by differences in natural enemy interactions across landscapes, rather than by the effectiveness of individual natural enemy guilds. In a field exclusion experiment, pest control by flying insect enemies increased with landscape complexity. However, so did antagonistic interactions between flying insects and birds, which were neutral in simple landscapes and increasingly negative in complex landscapes. Negative natural enemy interactions thus constrained pest control in complex landscapes. These results show that, by altering natural enemy interactions, landscape complexity can provide ecosystem services as well as disservices. Careful handling of the tradeoffs among multiple ecosystem services, biodiversity, and societal concerns is thus crucial and depends on our ability to predict the functional consequences of landscape-scale changes in trophic interactions. Keywords: arthropods and birds | biodiversity-ecosystem functioning | biological pest control | ecosystem service provision | land use intensification
Ecologists carry a well-stocked toolbox with a great variety of sampling methods, statistical analyses and modelling tools, and new methods are constantly appearing. Evaluation and optimisation of these methods is crucial to guide methodological choices. Simulating error-free data or taking high-quality data to qualify methods is common practice. Here, we emphasise the methodology of the 'virtual ecologist' (VE) approach where simulated data and observer models are used to mimic real species and how they are 'virtually' observed. These virtual data are then subjected to statistical analyses and modelling, and the results are evaluated against the 'true' simulated data. The VE approach is an intuitive and powerful evaluation framework that allows a quality assessment of sampling protocols, analyses and modelling tools. It works under controlled conditions as well as under consideration of confounding factors such as animal movement and biased observer behaviour. In this review, we promote the approach as a rigorous research tool, and demonstrate its capabilities and practical relevance. We explore past uses of VE in different ecological research fields, where it mainly has been used to test and improve sampling regimes as well as for testing and comparing models, for example species distribution models. We discuss its benefits as well as potential limitations, and provide some practical considerations for designing VE studies. Finally, research fields are identified for which the approach could be useful in the future. We conclude that VE could foster the integration of theoretical and empirical work and stimulate work that goes far beyond sampling methods, leading to new questions, theories, and better mechanistic understanding of ecological systems.
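A toy version of the VE workflow might look as follows: simulate a "true" state, pass it through an observer model, apply the analysis under test, and compare the result with the known truth. The occupancy setting, the parameter values and the deliberately naive estimator are assumptions chosen for this sketch and are not taken from the review.

```python
import numpy as np

rng = np.random.default_rng(2)
n_sites, n_visits = 2000, 3
true_occupancy, detection = 0.6, 0.5  # illustrative values (assumed)

# 1. Virtual species: which sites are truly occupied
occupied = rng.random(n_sites) < true_occupancy

# 2. Virtual observer: each visit to an occupied site detects the species
#    with probability `detection`; unoccupied sites never yield detections
detected = occupied[:, None] & (rng.random((n_sites, n_visits)) < detection)

# 3. Analysis under test: naive occupancy, the share of sites with any detection
naive = float(detected.any(axis=1).mean())

# 4. Evaluation against the simulated truth
truth = float(occupied.mean())
bias = naive - truth
print(f"truth={truth:.3f} naive={naive:.3f} bias={bias:.3f}")
```

Because the truth is known, the downward bias of the naive estimator under imperfect detection can be measured directly. The same loop could be rerun with a different number of visits or a different estimator to compare sampling protocols.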
In ecology, the true causal structure for a given problem is often not known, and several plausible models and thus model predictions exist. It has been claimed that using weighted averages of these models can reduce prediction error, as well as better reflect model selection uncertainty. These claims, however, are often demonstrated by isolated examples. Analysts must better understand under which conditions model averaging can improve predictions and their uncertainty estimates. Moreover, a large range of different model averaging methods exists, raising the question of how they differ in their behaviour and performance. Here, we review the mathematical foundations of model averaging along with the diversity of approaches available. We explain that the error in model‐averaged predictions depends on each model's predictive bias and variance, as well as the covariance in predictions between models, and uncertainty about model weights. We show that model averaging is particularly useful if the predictive error of contributing model predictions is dominated by variance, and if the covariance between models is low. For noisy data, which predominate in ecology, these conditions will often be met. Many different methods to derive averaging weights exist, ranging from Bayesian through information‐theoretic to cross‐validation‐optimized and resampling approaches. A general recommendation is difficult, because the performance of methods is often context dependent. Importantly, estimating weights creates some additional uncertainty. As a result, estimated model weights may not always outperform arbitrary fixed weights, such as equal weights for all models. When averaging a set of models with many inadequate models, however, estimating model weights will typically be superior to equal weights.
We also investigate the quality of the confidence intervals calculated for model‐averaged predictions, showing that they differ greatly in behaviour and seldom manage to achieve nominal coverage. Our overall recommendations stress the importance of non‐parametric methods such as cross‐validation for a reliable uncertainty quantification of model‐averaged predictions.
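The dependence of averaged prediction error on bias, variance and between-model covariance can be made concrete with a small numerical sketch. It assumes fixed, known weights, so the extra uncertainty from estimating weights is left out, and a common pairwise correlation `rho` between model predictions; the function names and the three-model setup are hypothetical.

```python
import numpy as np

def averaged_mse(weights, bias, cov):
    """Expected squared error of a weighted average of model predictions
    for fixed weights: (w . bias)^2 + w^T cov w."""
    w = np.asarray(weights, dtype=float)
    return float((w @ bias) ** 2 + w @ cov @ w)

def cov_matrix(sd, rho):
    """Covariance matrix of model predictions with common correlation rho."""
    sd = np.asarray(sd, dtype=float)
    corr = np.full((len(sd), len(sd)), rho)
    np.fill_diagonal(corr, 1.0)
    return corr * np.outer(sd, sd)

bias = np.zeros(3)            # error dominated by variance, no bias
sd = np.ones(3)               # each model has unit predictive variance
equal = np.full(3, 1.0 / 3.0)

single = averaged_mse([1.0, 0.0, 0.0], bias, cov_matrix(sd, 0.0))
low_cov = averaged_mse(equal, bias, cov_matrix(sd, 0.0))
high_cov = averaged_mse(equal, bias, cov_matrix(sd, 0.9))
print(single, low_cov, high_cov)
```

With uncorrelated predictions, equal-weight averaging cuts the error of a single model from 1 to 1/3. With correlation 0.9 the averaged error is (1 + 2 × 0.9)/3 ≈ 0.93, so most of the benefit disappears.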
Movement of organisms is one of the key mechanisms shaping biodiversity, e.g. the distribution of genes, individuals and species in space and time. Recent technological and conceptual advances have improved our ability to assess the causes and consequences of individual movement, and led to the emergence of the new field of ‘movement ecology’. Here, we outline how movement ecology can contribute to the broad field of biodiversity research, i.e. the study of processes and patterns of life among and across different scales, from genes to ecosystems, and we propose a conceptual framework linking these hitherto largely separated fields of research. Our framework builds on the concept of movement ecology for individuals, and demonstrates its importance for linking individual organismal movement with biodiversity. First, organismal movements can provide ‘mobile links’ between habitats or ecosystems, thereby connecting resources, genes, and processes among otherwise separate locations. Understanding these mobile links and their impact on biodiversity will be facilitated by movement ecology, because mobile links can be created by different modes of movement (i.e., foraging, dispersal, migration) that relate to different spatiotemporal scales and have differential effects on biodiversity. Second, organismal movements can also mediate coexistence in communities, through ‘equalizing’ and ‘stabilizing’ mechanisms. This novel integrated framework provides a conceptual starting point for a better understanding of biodiversity dynamics in light of individual movement and space-use behavior across spatiotemporal scales. By illustrating this framework with examples, we argue that the integration of movement ecology and biodiversity research will also enhance our ability to conserve diversity at the genetic, species, and ecosystem levels.