2023
DOI: 10.1038/s41597-023-02626-w

A globally synthesised and flagged bee occurrence dataset and cleaning workflow

James B. Dorey,
Erica E. Fischer,
Paige R. Chesshire
et al.

Abstract: Species occurrence data are foundational for research, conservation, and science communication, but the limited availability and accessibility of reliable data represents a major obstacle, particularly for insects, which face mounting pressures. We present BeeBDC, a new R package, and a global bee occurrence dataset to address this issue. We combined >18.3 million bee occurrence records from multiple public repositories (GBIF, SCAN, iDigBio, USGS, ALA) and smaller datasets, then standardised, flagged, dedup…

Cited by 14 publications (18 citation statements)
References 49 publications
“…The bee data collection community faces obstacles such as the complexities of sampling a diverse group of small and highly mobile species, taxonomic challenges that are exacerbated by lack of funding and support (Gonzalez et al, 2013; Woodard et al, 2020), and bee identification and digitization backlogs. These challenges are being met (Cobb et al, 2019; Seltmann et al, 2021; Chesshire et al, 2023; Dorey et al, 2023) but have almost certainly contributed to a relative lack of data to date. We also note that some major efforts to collect bee data across the U.S., such as through the USGS Native Bee Inventory and Monitoring Program, have only recently uploaded a complete version of their records, while others, such as state atlas projects, will yield more, and higher-quality, data in the coming years.…”
Section: Discussion (mentioning)
confidence: 99%
“…Because records with coordinates from the Snow Entomological Museum, University of Kansas, and the US National Museum had been included by Kitnya et al (2020) and most other GBIF records either lack coordinates or identifiable photos of the bees, we only obtained five new records from GBIF, all from the citizen science database Observation.org (Observation, 2023). Following the example of Dorey et al (2023), we also checked the Symbiota Collections of Arthropods Network (SCAN, 2023); it contained only the records for specimens housed in the Snow Entomological Museum which already had been retrieved from the GBIF database. iDigBio (2023) lacked records for this species.…”
Section: Methods (mentioning)
confidence: 99%
“…To qualify species–habitat associations, studies often rely on expert‐collected (EC) data. However, such data have severe limitations, particularly in Southeast Asia, where there is a substantial data gap (Feeley & Silman, 2011) including for bees (Dorey et al., 2023; Orr et al., 2021; Warrit et al., 2023). Consequently, community science (CS) data have huge potential to contribute to ecological studies, particularly where professionally collected data are lacking or unfeasible to collect at large scales (Brown & Williams, 2019; Theobald et al., 2015).…”
Section: Introduction (mentioning)
confidence: 99%