Summary
1. Including predictors in species distribution models at inappropriate spatial scales can decrease the variance explained, add residual spatial autocorrelation (RSA) and lead to the wrong conclusions. Some studies have measured predictors within different buffer sizes (scales) around sample locations, regressed each predictor against the response at each scale and selected the scale with the best model fit as the appropriate scale for this predictor. However, a predictor can influence a species at several scales, or show several scales with good model fit because of a bias caused by RSA. All scales with good model fit therefore need to be evaluated. With potentially several scales per predictor and multiple predictors to evaluate, the number of predictors can be large relative to the number of data points, potentially impeding variable selection with traditional statistical techniques, such as logistic regression.
2. We trialled a variable selection process using the random forest algorithm, which allows the simultaneous evaluation of several scales of multiple predictors. Using simulated responses, we compared the performance of models resulting from this approach with models using the known predictors at arbitrary and at the known spatial scales. We also applied the proposed approach to a real data set of curlew (Numenius arquata).
3. AIC, AUC and Nagelkerke's pseudo R² of the models resulting from the proposed variable selection were often very similar to those of the models with the known predictors at known spatial scales. Only two of nine models required the addition of spatial eigenvectors to account for RSA. Arbitrary scale models always required the addition of spatial eigenvectors. 75% (50-100%) of the known predictors were selected at scales similar to the known scale (within 3 km). In the curlew model, predictors at large, medium and small spatial scales were selected, suggesting that for appropriate landscape-scale models multiple scales need to be evaluated.
4. The proposed approach selected several of the correct predictors at appropriate spatial scales out of 544 possible predictors. Thus, it facilitates the evaluation of multiple spatial scales of multiple predictors against each other in landscape-scale models.
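The idea of measuring one predictor within several buffer sizes around a sample location can be sketched as follows. This is a minimal illustration only, not the authors' workflow: the grid representation, the function names `buffer_mean` and `multi_scale_predictors`, and the `cover` column prefix are all assumptions made for the example. Each radius yields one candidate predictor column, which is how several scales of several predictors can add up to hundreds of candidates for a variable selection step such as random forest (that step is not shown here).

```python
import math


def buffer_mean(grid, row, col, radius, cell_size=1.0):
    """Mean of all grid cells whose centres lie within `radius` of cell (row, col).

    `grid` is a list of equal-length rows; distances are measured between
    cell centres in the same units as `radius` and `cell_size`.
    """
    reach = int(radius // cell_size)  # how many cells the buffer can extend
    values = []
    for r in range(max(0, row - reach), min(len(grid), row + reach + 1)):
        for c in range(max(0, col - reach), min(len(grid[0]), col + reach + 1)):
            if math.hypot(r - row, c - col) * cell_size <= radius:
                values.append(grid[r][c])
    return sum(values) / len(values)


def multi_scale_predictors(grid, row, col, radii, name="cover"):
    """One candidate predictor per buffer radius, keyed as '<name>_<radius>'."""
    return {f"{name}_{radius}": buffer_mean(grid, row, col, radius) for radius in radii}


# Hypothetical 5 x 5 habitat grid with a single habitat cell in the centre.
example_grid = [[0] * 5 for _ in range(5)]
example_grid[2][2] = 1
example_row = multi_scale_predictors(example_grid, 2, 2, radii=[0, 1, 2])
```

In this toy grid the same location gets a different value at every radius, which is why a single arbitrarily chosen scale can misrepresent how a predictor relates to the response.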
Opportunistically collected species observations contributed by volunteer reporters are increasingly available for species and regions for which systematically collected data are not available. However, it is unclear if they are suitable to produce reliable habitat suitability models (HSMs), and hence if the species–habitat relationships found and habitat suitability maps produced can be used with confidence to advise conservation management and address basic and applied research questions. We evaluated HSMs with opportunistically collected observations against HSMs with systematically collected observations. We enhanced the opportunistically collected presence‐only data by adding inferred species absences. To obtain inferred absences, we asked individual reporters about their identification skills and if they reported certain species consistently and combined this information with their observations. We evaluated several HSM methods using a forest bird species, Siberian jay (Perisoreus infaustus), in Sweden: logistic regression with inferred absences, two versions of MaxEnt, a model combining presence–absence with presence‐only observations and a Bayesian site‐occupancy‐detection model. All HSM methods produced nationwide habitat suitability maps of Siberian jay that agreed well with systematically collected observations (AUC: 0.86–0.88) and were very similar to a habitat suitability map produced from the HSM with systematically collected observations (Spearman rho: 0.94–0.98). At finer geographical scales there were differences among methods: the habitat suitability maps from logistic regression with inferred absences agreed better with results from systematically collected observations than those from other methods. The species–habitat relationships found with logistic regression also agreed well with those found from systematically collected data and with prior expectations based on the species' ecology.
Synthesis and application. For many regions and species, systematically collected data are not available. By using inferred absences from high‐quality, opportunistically collected contributions of a few very active reporters in logistic regression we obtained HSMs that produced results similar to those from a systematic survey. Adding high‐quality inferred absences to opportunistically collected data is likely possible for many less common species across various organism groups. Well‐performing HSMs are important to facilitate applications such as spatial conservation planning and prioritization, monitoring of invasive species, understanding species habitat requirements or climate change studies.
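The agreement between a habitat suitability map and independent presence/absence observations is summarised above with AUC. As a minimal sketch of what that statistic measures, assuming binary labels (1 = presence, 0 = absence) and one predicted suitability score per site, AUC can be computed as the share of presence/absence pairs in which the presence site has the higher score, with ties counted as half. The function name `auc` and the example values are hypothetical and not taken from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    labels: iterable of 1 (presence) or 0 (absence)
    scores: predicted suitability for the same sites, higher = more suitable
    """
    presences = [s for y, s in zip(labels, scores) if y == 1]
    absences = [s for y, s in zip(labels, scores) if y == 0]
    if not presences or not absences:
        raise ValueError("AUC needs at least one presence and one absence")
    wins = 0.0
    for p in presences:
        for a in absences:
            if p > a:
                wins += 1.0   # presence ranked above absence
            elif p == a:
                wins += 0.5   # tie counts as half
    return wins / (len(presences) * len(absences))


# Hypothetical check: one presence site is scored below one absence site.
example_auc = auc([1, 1, 0, 0], [0.9, 0.2, 0.3, 0.1])
```

The pairwise form is quadratic in the number of sites, so it suits small illustrative data sets; a rank-based implementation would be the usual choice for a nationwide map.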
Summary
1. High-resolution vegetation maps are a valuable resource for conservation, land management and research. In Great Britain, the National Vegetation Classification (NVC) is widely used to describe vegetation communities. NVC maps are typically produced from ground surveys, which are prohibitively expensive for large areas. An approach to produce NVC maps more cost-effectively for large areas would be valuable.
2. Creation of vegetation community maps from aerial or satellite images has often had limited success as the clusters separable by spectral reflectance frequently do not correspond well to vegetation community classes. Such maps have also been produced by exploring correlations between community occurrence and environmental variables. The latter approach can have limitations where anthropogenic activities have altered the distribution of vegetation communities. We combined these two approaches and classified 24 common NVC classes of the Yorkshire Dales and an additional class 'wood' consisting of trees and bushes at a resolution of 5 m from mostly remotely sensed variables with the random forest algorithm.
3. Classification accuracy was highest when environmental variables at low and high resolution (50 and 5-10 m, respectively) were added to aerial image information aggregated to a resolution of 5 m. Low-resolution environmental variables are likely to be correlated with the dominant vegetation surrounding a location and thus could represent critical area requirements or local species pool effects, while high-resolution environmental variables represent the environmental conditions at the location.
4. Overall classification accuracy was 87-92%. The median producer's and user's class accuracies were 95% (58-100%) and 92% (67-100%), respectively.
5. Synthesis and applications. The classification accuracies achieved in this study, the number of classes differentiated, their level of detail and the resolution were high compared with those of other studies. This approach could allow the production of good-quality NVC maps for large areas. In contrast to existing maps of broad land cover types, such maps would provide more detailed vegetation community data for applications like the monitoring of vegetation in a changing climate, the study of animal-habitat relationships, conservation management or land use planning.
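The overall, producer's and user's accuracies reported above are all derived from a confusion matrix. The sketch below shows the standard definitions under one stated assumption: rows hold the reference (ground survey) class and columns hold the mapped (predicted) class. The function name `accuracy_summary` and the two-class example matrix are hypothetical and do not reproduce any figures from the study.

```python
def accuracy_summary(confusion):
    """Overall, producer's and user's accuracy from a square confusion matrix.

    confusion[i][j] = number of samples of reference class i mapped as class j.
    Producer's accuracy: correct / reference total (row sum) per class.
    User's accuracy:     correct / mapped total (column sum) per class.
    Classes with a zero total get None instead of an accuracy.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    row_totals = [sum(confusion[i]) for i in range(n)]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    producers = [
        confusion[i][i] / row_totals[i] if row_totals[i] else None for i in range(n)
    ]
    users = [
        confusion[j][j] / col_totals[j] if col_totals[j] else None for j in range(n)
    ]
    return correct / total, producers, users


# Hypothetical two-class example: 10 reference samples per class.
example_matrix = [
    [8, 2],  # reference class 0: 8 mapped correctly, 2 mapped as class 1
    [1, 9],  # reference class 1: 1 mapped as class 0, 9 mapped correctly
]
overall, producers, users = accuracy_summary(example_matrix)
```

Producer's accuracy answers how much of a class on the ground was found by the map, while user's accuracy answers how far a mapped class can be trusted, so the two can differ considerably for the same class.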