Biodiversity assessments use a variety of data and models. We propose best-practice standards for studies in these assessments.
Predictive models are central to many scientific disciplines and vital for informing management in a rapidly changing world. However, limited understanding of the accuracy and precision of models transferred to novel conditions (their 'transferability') undermines confidence in their predictions. Here, 50 experts identified priority knowledge gaps which, if filled, will most improve model transfers. These are summarized into six technical and six fundamental challenges, which underlie the combined need to intensify research on the determinants of ecological predictability, including species traits and data quality, and develop best practices for transferring models. Of high importance is the identification of a widely applicable set of transferability metrics, with appropriate tools to quantify the sources and impacts of prediction uncertainty under novel conditions.
Predicting the Unknown
Predictions facilitate the formulation of quantitative, testable hypotheses that can be refined and validated empirically [1]. Predictive models have thus become ubiquitous in numerous scientific disciplines, including ecology [2], where they provide means for mapping species distributions, explaining population trends, or quantifying the risks of biological invasions and disease outbreaks (e.g., [3,4]). The practical value of predictive models in supporting policy and decision making has therefore grown rapidly (Box 1) [5]. With that has come an increasing desire to predict (see Glossary) the state of ecological features (e.g., species, habitats) and our likely impacts upon them [5], prompting a shift from explanatory models to anticipatory predictions [2]. However, in many situations, severe data deficiencies preclude the development of specific models, and the collection of new data can be prohibitively costly or simply impossible [6].
It is in this context that interest in transferable models (i.e., those that can be legitimately projected beyond the spatial and temporal bounds of their underlying data [7]) has grown. Transferred models must balance the tradeoff between estimation and prediction bias and variance (homogenization versus nontransferability, sensu [8]). Ultimately, models that can...
Highlights
Models transferred to novel conditions could provide predictions in data-poor scenarios, contributing to more informed management decisions.
The determinants of ecological predictability are, however, still insufficiently understood.
Predictions from transferred ecological models are affected by species' traits, sampling biases, biotic interactions, nonstationarity, and the degree of environmental dissimilarity between reference and target systems.
We synthesize six technical and six fundamental challenges that, if resolved, will catalyze practical and conceptual advances in model transfers.
We propose that the most immediate obstacle to improving understanding lies in the absence of a widely applicable set of metrics for assessing transferability, and that encouraging the development of models grounded in well-established mech...
Logistic regression is a statistical tool widely used for predicting species' potential distributions starting from presence/absence data and a set of independent variables. However, logistic regression equations compute probability values based not only on the values of the predictor variables but also on the relative proportion of presences and absences in the dataset, which does not adequately describe the environmental favourability for or against species presence. A few strategies have been used to circumvent this, but they usually imply an alteration of the original data or the discarding of potentially valuable information. We propose a way to obtain from logistic regression an environmental favourability function whose results are not affected by an uneven proportion of presences and absences. We tested the method on the distribution of virtual species in an imaginary territory. The favourability models yielded similar values regardless of the variation in the presence/absence ratio. We also illustrate with the example of the Pyrenean desman's (Galemys pyrenaicus) distribution in Spain. The favourability model yielded more realistic potential distribution maps than the logistic regression model. Favourability values can be regarded as the degree of membership of the fuzzy set of sites whose environmental conditions are favourable to the species, which enables applying the rules of fuzzy logic to distribution modelling. They also allow for direct comparisons between models for species with different presence/absence ratios in the study area. This makes them more useful to estimate the conservation value of areas, to design ecological corridors, or to select appropriate areas for species reintroductions.
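The abstract above does not state the favourability formula itself. As a minimal sketch, the form usually given for it is F = [P/(1-P)] / [n1/n0 + P/(1-P)], where P is the logistic regression probability and n1 and n0 are the numbers of presences and absences; treat this expression, and every function name below, as an assumption rather than a quotation from the paper. The example also assumes the idealized case in which prevalence only shifts the model intercept by ln(n1/n0).

```python
import math


def logistic_probability(linear_predictor):
    """Inverse logit: probability implied by a logistic regression predictor."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def favourability(p, n_presences, n_absences):
    """Favourability from a logistic probability p (assumes 0 < p < 1).

    Divides the model odds by (prevalence odds + model odds), so the
    presence/absence ratio of the training data is factored out.
    """
    odds = p / (1.0 - p)
    prevalence_odds = n_presences / n_absences
    return odds / (prevalence_odds + odds)


# Hypothetical site with a fixed environmental signal (logit scale).
environmental_logit = 0.8

# Two hypothetical datasets that differ only in prevalence (idealized:
# the intercept absorbs the log prevalence odds, nothing else changes).
results = []
for n1, n0 in [(20, 80), (50, 50)]:
    p = logistic_probability(math.log(n1 / n0) + environmental_logit)
    results.append((p, favourability(p, n1, n0)))

for p, f in results:
    print(f"P = {p:.3f}  F = {f:.3f}")
```

Under these assumptions the probability P changes with prevalence while F stays the same, and a site whose P equals the dataset prevalence gets F = 0.5.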
Bonelli's eagle, Hieraaetus fasciatus, has recently suffered a severe population decline and is currently endangered. Spain supports about 70% of the European population. We used stepwise logistic regression on a set of environmental, spatial and human variables to model Bonelli's eagle distribution in the 5167 UTM 10 × 10 km quadrats of peninsular Spain. We obtained a model based on 16 variables, which allowed us to identify favourable and unfavourable areas for this species in Spain, as well as intermediate favourability areas. We assessed the stepwise progression of the model by comparing the model's predictions in each step with those of the final model, and selected a parsimonious explanatory model based on three variables (slope, July temperature and precipitation) comprising 76% of the predictive capacity of the final model. The reported presences in favourable and unfavourable areas suggest source-sink dynamics in Bonelli's eagle populations. The fragmented spatial structure of the favourable areas suggests the existence of superimposed metapopulation dynamics. Previous LIFE (The Financial Instrument of the European Union for the Environment and Nature) projects for the conservation of this species have focused mainly on the northern limit of its range, where the sharpest population decline has been recorded. In these areas, favourability is low and Bonelli's eagle populations are probably maintained by the immigration of juveniles produced in more favourable zones. However, southern populations, although stable, show signs of reduced productivity, which could threaten population sizes across the whole study area. We suggest that conservation efforts should also focus on known favourable areas, which might favour population persistence in unfavourable areas through dispersal.
Aim: When faced with dichotomous events, such as the presence or absence of a species, discrimination capacity (the ability to separate the instances of presence from the instances of absence) is usually the only characteristic that is assessed in the evaluation of the performance of predictive models. Although neglected, calibration or reliability (how well the estimated probability of presence represents the observed proportion of presences) is another aspect of the performance of predictive models that provides important information. In this study, we explore how changes in the distribution of the probability of presence make discrimination capacity a context-dependent characteristic of models. For the first time, we explain the implications that ignoring the context dependence of discrimination can have in the interpretation of species distribution models.
Innovation: In this paper we corroborate that, under a uniform distribution of the estimated probability of presence, a well-calibrated model will not attain high discrimination power and the value of the area under the curve will be 0.83. Under non-uniform distributions of the probability of presence, simulations show that a well-calibrated model can attain a broad range of discrimination values. These results illustrate that discrimination is a context-dependent property, i.e. it gives information about the performance of a certain algorithm in a certain data population.
Main conclusions: In species distribution modelling, the discrimination capacity of a model is only meaningful for a certain species in a given geographic area and temporal snapshot. This is because the representativeness of the environmental domain changes with the geographical and temporal context, which unavoidably entails changes in the distribution of the probability of presence. Comparative studies that intend to generalize their results only based on the discrimination capacity of models may not be broadly extrapolated. Assessment of calibration is especially recommended when the models are intended to be transferred in time or space.
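The 0.83 figure can be checked with a small simulation. The sketch below is not the authors' code: it assumes a perfectly calibrated model (each site is occupied with exactly its predicted probability), draws probabilities from a uniform distribution, and computes the area under the curve from ranks (the Mann-Whitney formulation). The sample size, seed and function name are arbitrary choices for illustration. Analytically, this setting gives an AUC of 5/6.

```python
import random


def auc_from_ranks(scores, labels):
    """AUC via the Mann-Whitney U statistic (assumes no tied scores)."""
    order = sorted(range(len(scores)), key=scores.__getitem__)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Sum of the 1-based ranks held by the presences.
    rank_sum = sum(rank for rank, i in enumerate(order, start=1) if labels[i] == 1)
    u = rank_sum - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Predicted probabilities of presence, uniformly distributed on [0, 1).
probs = [rng.random() for _ in range(50_000)]

# Perfect calibration: each site is a presence with exactly its probability.
labels = [1 if rng.random() < p else 0 for p in probs]

auc_uniform = auc_from_ranks(probs, labels)
print(f"simulated AUC = {auc_uniform:.3f}; theoretical value = {5 / 6:.3f}")
```

Replacing the uniform draw with a different distribution (for example, probabilities concentrated near 0 and 1, or near 0.5) changes the AUC while calibration stays perfect, which is the context dependence the abstract describes.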
Summary
1. Binary similarity indices are widely used in ecology, for example for detecting associations between species occurrence patterns, comparing regional and temporal species assemblages, and assessing beta diversity patterns, including spatial and temporal species loss and turnover. Such indices have widespread applications in biogeography, global change biology and biodiversity conservation.
2. Similarity indices are commonly calculated upon binary presence/absence (or sometimes modelled suitable/unsuitable) data, which are generally incomplete and more categorical than their underlying natural patterns. Probable false absences are disregarded, amplifying the effects of data deficiencies and the scale dependence of the results.
3. Fuzzy occurrence data, with a degree of uncertainty attributed to localities where presence or absence cannot be safely assigned, could better reflect species distributions, compensating for incomplete knowledge and methodological errors. Similarity indices would therefore also benefit from accommodating such fuzzy data directly.
4. This study proposes fuzzy versions of the binary similarity indices most commonly used in ecology, so that they can be directly applied to continuous (fuzzy) rather than binary occurrence values, thus producing more realistic similarity assessments. Fuzzy occurrence can be obtained with several methods, some of which are also provided. The procedure is robust to data source disparities, gaps or other errors in species occurrence records, even for restricted species for which slight inaccuracies can affect substantial parts of their range.
5. The method is implemented in a free and open-source software package, fuzzySim, which is available for the R statistical software and under implementation for the QGIS geographic information system. It is provided with sample data and an illustrated tutorial suitable for non-experienced users.
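To illustrate how a binary index can be extended to fuzzy occurrence values, here is a minimal sketch of a fuzzy Jaccard index. It assumes the standard fuzzy set operators (minimum for intersection, maximum for union); the abstract does not spell out the formulas, so this is one plausible reading and not the fuzzySim implementation. The function name and the example values are hypothetical.

```python
def fuzzy_jaccard(a, b):
    """Jaccard similarity for occurrence values in [0, 1].

    Fuzzy intersection = sum of pairwise minima; fuzzy union = sum of
    pairwise maxima. With strictly 0/1 inputs this reduces to the usual
    binary Jaccard index (shared presences / sites with any presence).
    """
    if len(a) != len(b):
        raise ValueError("both species need values for the same sites")
    intersection = sum(min(x, y) for x, y in zip(a, b))
    union = sum(max(x, y) for x, y in zip(a, b))
    return intersection / union if union > 0 else float("nan")


# Hypothetical binary records for two species at four sites.
binary_a = [1, 1, 0, 0]
binary_b = [1, 0, 1, 0]

# Hypothetical fuzzy occurrence values for the same sites.
fuzzy_a = [0.9, 0.6, 0.2, 0.0]
fuzzy_b = [0.8, 0.1, 0.7, 0.1]

binary_similarity = fuzzy_jaccard(binary_a, binary_b)
fuzzy_similarity = fuzzy_jaccard(fuzzy_a, fuzzy_b)
print(binary_similarity, fuzzy_similarity)
```

Because the same function handles both cases, uncertain localities contribute partial weight instead of being forced to 0 or 1.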
Abstract
Transferring distribution models between different geographical areas may be problematic, as the performance of models outside their original scope is hard to predict. A modelling procedure is needed that gets the gist of the environmental descriptors of a distribution area, without either overfitting to the training data or overestimating the species' distribution potential. We tested the transferability power of the favourability function, a generalized linear model, on the distribution of the Iberian desman (Galemys pyrenaicus) in the Iberian territories of Portugal and Spain. We also tested the effects of two of the main potential constraints on model transferability: the analysed ranges of the predictor variables, and the completeness of the species distribution data. We modelled 10 km × 10 km presence/absence data from Portugal and Spain separately, extrapolated each model to the other country, and compared predictions with observations. The Spanish model, despite arguably containing more false absences, showed good predictive ability in Portugal. The Portuguese model, whose predictors ranged between only a subset of the values observed in Spain, overestimated desman distribution when transferred. We discuss possible reasons for this differential model behaviour, and highlight the importance of this kind of model for prediction and conservation applications.
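One of the constraints named above, the analysed range of each predictor, can be screened before a model is transferred. The sketch below is a simple illustration, not the procedure used in the study: for one predictor it reports the share of target-area sites whose values fall outside the range seen in the training area. The function name and the numbers are hypothetical.

```python
def fraction_outside_range(train_values, target_values):
    """Share of target sites whose predictor value lies outside the
    minimum-maximum interval observed in the training area."""
    lowest, highest = min(train_values), max(train_values)
    outside = sum(1 for v in target_values if v < lowest or v > highest)
    return outside / len(target_values)


# Hypothetical values of one predictor (e.g. a temperature variable).
training_area = [5.0, 10.0, 15.0]
target_area = [0.0, 7.0, 12.0, 20.0]

extrapolated_share = fraction_outside_range(training_area, target_area)
print(f"{extrapolated_share:.0%} of target sites require extrapolation")
```

A high share for any predictor signals that the model would be extrapolating when transferred, the situation in which the Portuguese model overestimated the desman's distribution.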