1. Models for predicting the distribution of organisms from environmental data are widespread in ecology and conservation biology. Their performance is invariably evaluated from the percentage success at predicting occurrence at test locations.
2. Using logistic regression with real data from 34 families of aquatic invertebrates in 180 Himalayan streams, we illustrate how this widespread measure of predictive accuracy is affected systematically by the prevalence (i.e. the frequency of occurrence) of the target organism. Many evaluations of presence-absence models by ecologists are inherently misleading.
3. With the same invertebrate models, we examined alternative performance measures used in remote sensing and medical diagnostics. We particularly explored receiver-operating characteristic (ROC) plots, from which were derived (i) the area under each curve (AUC), considered an effective indicator of model performance independent of the threshold probability at which the presence of the target organism is accepted, and (ii) optimized probability thresholds that maximize the percentage of true absences and presences that are correctly identified. We also evaluated Cohen's kappa, a measure of the proportion of all possible cases of presence or absence that are predicted correctly after accounting for chance effects.
4. AUC measures from ROC plots were independent of prevalence, but highly significantly correlated with the much more easily computed kappa. Moreover, when applied in predictive mode to test data, models with thresholds optimized by ROC erroneously overestimated true occurrence among scarcer organisms, often those of greatest conservation interest. We advocate caution in using ROC methods to optimize thresholds required for real prediction.
5. Our strongest recommendation is that ecologists reduce their reliance on prediction success as a performance measure in presence-absence modelling. Cohen's kappa provides a simple, effective, standardized and appropriate statistic for evaluating or comparing presence-absence models, even those based on different statistical algorithms. None of the performance measures we examined tests the statistical significance of predictive accuracy, and we identify this as a priority area for research and development.
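The contrast between raw prediction success and Cohen's kappa described in this abstract can be illustrated with a minimal sketch. The function name `confusion_metrics` and the site counts below are hypothetical and chosen only for illustration; they are not taken from the Himalayan stream data.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Return (prediction success, Cohen's kappa) for a 2x2 presence-absence table.

    tp: presences predicted present    fp: absences predicted present
    fn: presences predicted absent     tn: absences predicted absent
    """
    n = tp + fp + fn + tn
    if n == 0:
        raise ValueError("confusion matrix is empty")
    # Proportion of all test sites classified correctly.
    success = (tp + tn) / n
    # Agreement expected by chance, from the marginal totals.
    chance_present = ((tp + fp) / n) * ((tp + fn) / n)
    chance_absent = ((tn + fn) / n) * ((tn + fp) / n)
    expected = chance_present + chance_absent
    if expected == 1.0:
        # Kappa is undefined when chance agreement is perfect; report 0 here.
        return success, 0.0
    kappa = (success - expected) / (1.0 - expected)
    return success, kappa


# Hypothetical rare taxon: present at 10 of 100 sites, model always predicts absence.
rare_success, rare_kappa = confusion_metrics(tp=0, fp=0, fn=10, tn=90)

# Hypothetical common taxon: present at 50 of 100 sites, with a genuinely informative model.
common_success, common_kappa = confusion_metrics(tp=40, fp=10, fn=10, tn=40)

print(f"rare:   success={rare_success:.2f}, kappa={rare_kappa:.2f}")
print(f"common: success={common_success:.2f}, kappa={common_kappa:.2f}")
```

In this invented example the uninformative model for the rare taxon scores a higher prediction success than the informative model for the common taxon, while kappa ranks them the other way round, which is the kind of prevalence effect the abstract warns about.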
Times Cited: 83
Assignment methods, which use genetic information to ascertain population membership of individuals or groups of individuals, have been used in recent years to study a wide range of evolutionary and ecological processes. In applied studies, the first step of articulating the biological question(s) to be addressed should be followed by selection of the method(s) best suited for the analysis. However, this first step often receives less attention than it should, and the recent proliferation of assignment methods has made the selection step challenging. Here, we review assignment methods and discuss how to match the appropriate methods with the underlying biological questions for several common problems in ecology and conservation (assessing population structure; measuring dispersal and hybridization; and forensics and mixture analysis). We also identify several topics for future research that should ensure that this field remains dynamic and productive.
The amplified fragment length polymorphism (AFLP) technique has recently gained considerable popularity and is now frequently applied to a wide variety of organisms. The technical specifics of the AFLP procedure have been well documented over the years, but information about the statistical analysis of AFLPs remains sparse and scattered. In this review, we describe the various methods available to handle AFLP data, focusing on four research topics at the population or individual level of analysis: (i) assessment of genetic diversity; (ii) identification of population structure; (iii) identification of hybrid individuals; and (iv) detection of markers associated with phenotypes. Two kinds of analysis methods can be distinguished, depending on whether they are based on the direct study of band presences or absences in AFLP profiles ('band-based' methods), or on allelic frequencies estimated at each locus from these profiles ('allele frequency-based' methods). We investigate the characteristics and limitations of these statistical tools; finally, we call for wider adoption of methodologies borrowed from other research fields, such as those specifically designed to deal with binary data.
Thanks to genome-scale diversity data, present-day studies can provide a detailed view of how natural and cultivated species adapt to their environment and particularly to environmental gradients. However, because of this increased resolution, such studies might also be more sensitive to undocumented demographic effects such as the pattern of migration and the reproduction regime. In this study, we provide guidelines for the use of popular or recently developed statistical methods to detect footprints of selection. We simulated 100 populations along a selective gradient and explored different migration models, sampling schemes and rates of self-fertilization. We investigated the power and robustness of eight methods to detect loci potentially under selection: three designed to detect genotype-environment correlations and five designed to detect adaptive differentiation (based on F(ST) or similar measures). We show that genotype-environment correlation methods have substantially more power to detect selection than differentiation-based methods but that they generally suffer from high rates of false positives. This effect is exacerbated whenever allele frequencies are correlated, either between populations or within populations. Our results suggest that, when the underlying genetic structure of the data is unknown, a number of robust methods are preferable. Moreover, in the simulated scenario we used, sampling many populations led to better results than sampling many individuals per population. Finally, care should be taken when using methods to identify genotype-environment correlations without correcting for allele frequency autocorrelation, because allele frequency correlations between populations can produce spurious signals.
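As a rough illustration of the differentiation measure mentioned in this abstract, the following sketch computes a simple heterozygosity-based F(ST) for one biallelic locus. It assumes equally weighted populations and known allele frequencies; the function name `fst_biallelic` and the example frequencies are hypothetical, and this is not one of the eight methods evaluated in the study.

```python
def fst_biallelic(freqs):
    """Simple F(ST) for one biallelic locus, computed as (H_T - H_S) / H_T.

    freqs: frequency of one allele in each population (equal weights assumed).
    """
    if not freqs:
        raise ValueError("at least one population frequency is required")
    if any(p < 0.0 or p > 1.0 for p in freqs):
        raise ValueError("allele frequencies must lie in [0, 1]")
    k = len(freqs)
    p_bar = sum(freqs) / k
    # Expected heterozygosity of the pooled (total) population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        # Locus is fixed everywhere, so there is no variation to partition.
        return 0.0
    # Mean expected heterozygosity within populations.
    h_within = sum(2.0 * p * (1.0 - p) for p in freqs) / k
    return (h_total - h_within) / h_total


# Hypothetical frequencies at the two ends of a gradient.
print(fst_biallelic([0.1, 0.9]))   # strongly differentiated locus
print(fst_biallelic([0.5, 0.5]))   # no differentiation
```

A locus whose allele frequency shifts strongly along the gradient yields a high value, whereas identical frequencies give zero. Outlier approaches of the kind the abstract refers to compare such per-locus values against a neutral expectation, which this sketch does not attempt.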
Local adaptations can determine the potential of populations to respond to environmental changes, yet adaptive genetic variation is commonly ignored in models forecasting species vulnerability and biogeographical shifts under future climate change. Here we integrate genomic and ecological modeling approaches to identify genetic adaptations associated with climate in two cryptic forest bats. We then incorporate this information directly into forecasts of range changes under future climate change and assessment of population persistence through the spread of climate-adaptive genetic variation (evolutionary rescue potential). Considering climate-adaptive potential reduced range loss projections, suggesting that failure to account for intraspecific variability can result in overestimation of future losses. On the other hand, range overlap between species was projected to increase, indicating that interspecific competition is likely to play an important role in limiting species’ future ranges. We show that although evolutionary rescue is possible, it depends on a population’s adaptive capacity and connectivity. Hence, we stress the importance of incorporating genomic data and landscape connectivity in climate change vulnerability assessments and conservation management.
Landscape genetics plays an increasingly important role in the management and conservation of species. Here, we highlight some of the opportunities and challenges in using landscape genetic approaches in conservation biology. We first discuss challenges related to sampling design and introduce several recent methodological developments in landscape genetics (analyses based on pairwise relatedness, the application of Bayesian methods, inference from landscape resistance and a shift from population-based to individual-based analyses). We then show how simulations can foster the field of landscape genetics and, finally, elaborate on technical developments in sequencing techniques that will dramatically improve our ability to study genetic variation in wild species, opening up new and unprecedented avenues for genetic analysis in conservation biology.