2020
DOI: 10.1111/2041-210x.13525

Spatial thinning and class balancing: Key choices lead to variation in the performance of species distribution models with citizen science data

Abstract: Spatial biases are a common feature of presence–absence data from citizen scientists. Spatial thinning can mitigate errors in species distribution models (SDMs) that use these data. When detections or non‐detections are rare, however, SDMs may suffer from class imbalance or low sample size of the minority (i.e. rarer) class. Poor predictions can result, the severity of which may vary by modelling technique. To explore the consequences of spatial bias and class imbalance in presence–absence data, we used eBird …

Cited by 69 publications
(67 citation statements)
References 55 publications
“…While many SDMs are based on PO data similar to ours owing to spatial thinning where occurrences are removed to reduce the effect of spatial sampling bias on parameter estimates (see Steen et al. [2020] for a review), multiple occurrences per coarse‐grain pixels are frequently considered in the context of data integration using a Poisson point process (PPP; Dorazio 2014, Fithian et al. 2015a, Fletcher et al.…”
Section: Methods (mentioning)
confidence: 99%
“…They are thus different from classical PO data obtained from databases of opportunistic records (e.g., GBIF) where multiple occurrences can occur in a given cell. While many SDMs are based on PO data similar to ours owing to spatial thinning where occurrences are removed to reduce the effect of spatial sampling bias on parameter estimates (see Steen et al [2020] for a review), multiple occurrences per coarse-grain pixels are frequently considered in the context of data integration using a Poisson point process (PPP; Dorazio 2014, Fithian et al 2015a, Fletcher et al 2016, Koshkina et al 2017). Although we here focus on PO data obtained from IUCN range maps or atlases where only one occurrence per coarse-grain cell is considered, the PO-based models presented below are based on a PPP and are therefore also valid with multiple occurrences.…”
Section: Data Simulations (mentioning)
confidence: 99%
“…birds), and a non-random skew of observations towards anthropogenic land-uses ('road-side bias'), whether the focus be on any, rare or non-native species. Thus, the study by Petersen et al (2021) provides useful context for the remaining three papers under the theme of quality control which consider how to account for different forms of bias generated through citizen science data related to site selection and observer retention over time (Dambly et al, 2021), artificial light conditions (Ditmer et al, 2021), and choice of methods in generating less biased SDMs (Steen et al, 2021).…”
Section: Quality Assurance and Control (mentioning)
confidence: 99%
“…However, various methods of bias correction have been suggested and combined in the literature. For example, spatial thinning of species observations according to the study area’s resolution (Aiello‐Lammens et al., 2015; Kiedrzyński et al., 2017; Steen et al., 2020), model‐based corrections (Komori et al., 2020; Stolar & Nielsen, 2015), or combining species observations with large survey data (Fithian et al., 2015; Fletcher et al., 2016) have been recommended. Yet overall, sampling background data with the target‐group strategy—that is, using similar sampling design/bias to sample background points as observed species presence points—remains currently the most popular approach (Botella et al., 2020; Hertzog et al., 2014; Kramer‐Schadt et al., 2013; Phillips et al., 2009; Righetti et al., 2019).…”
Section: Introduction (mentioning)
confidence: 99%
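
Several of the statements above describe spatial thinning as removing occurrences so that dense clusters of records do not dominate a model. As a rough illustration only, the sketch below keeps at most one record per square grid cell. It is not the procedure used by Steen et al. or by any of the citing papers; the function name `thin_to_grid`, the planar coordinates, and the `cell_size` parameter are all assumptions made for the example.

```python
import math
import random


def thin_to_grid(points, cell_size, seed=0):
    """Keep at most one point per square grid cell of side `cell_size`.

    `points` is an iterable of (x, y) tuples in planar units (assumed, e.g.
    projected metres). One point is drawn at random from each occupied cell.
    """
    rng = random.Random(seed)  # fixed seed so the thinning is repeatable
    cells = {}
    for x, y in points:
        # Integer index of the grid cell that contains this point.
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells.setdefault(key, []).append((x, y))
    # Visit cells in sorted order so the result does not depend on input order
    # of the cells; pick one representative record from each.
    return [rng.choice(members) for _, members in sorted(cells.items())]


# Hypothetical records: two clustered pairs and one isolated point.
records = [(0.1, 0.1), (0.2, 0.3), (1.5, 0.2), (1.7, 0.9), (3.2, 3.3)]
thinned = thin_to_grid(records, cell_size=1.0)
print(len(records), "records thinned to", len(thinned))
```

Choosing `cell_size` to match the resolution of the environmental layers corresponds to the "according to the study area's resolution" idea quoted above. Distance-based thinning, where retained points must be at least some minimum distance apart, is a different technique and is not shown here.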