Completeness analysis for over 3000 United States bee species identifies persistent data gap

Chesshire, Paige R.; Fischer, Erica E.; Dowdy, Nicolas J.; Griswold, Terry; Hughes, Alice C.; Orr, Michael C.; Ascher, John S.; Guzman, Laura Melissa; Hung, Keng‐Lou James; Cobb, Neil S.; McCabe, Lindsie M.

doi:10.1111/ecog.06584

Cited by 28 publications

(38 citation statements)

References 47 publications

“…Collecting data in the deficient classes should therefore be among the priorities for new collections. New data collections that integrate community science data with specimen-based documentation and development of societal initiatives like citizen science can help achieve that purpose [32,33,34]. Many medicinal plants, microalgae, and phytoplankton belong to the underrepresented plant classes.…”

Section: Discussion and Conclusion (mentioning)

confidence: 99%

“…Opening of new roads, with more representativeness of ecological conditions of the landscapes across African countries, can significantly contribute to less biased data collections. Addressing less charismatic species and developing societal initiatives like citizen science can also contribute to less biased data collection [32]. More incentives in data publication and more intense connections between researchers can also contribute to filling gaps of information [10].…”

Section: 4. Impact of Accessibility and Protected Areas on Data Comple... (mentioning)

confidence: 99%

Biodiversity conservation at the digital age, will Africa meet the challenge?

Ganglo

2022

Preprint

Digital Accessible Knowledge (DAK) is of utmost importance for biodiversity conservation; indeed, its use is indispensable to provide evidence and strategies to support decision-making on natural resource management and sustainable use. The Global Biodiversity Information Facility (GBIF, www.gbif.org) is a mega data infrastructure with more than two billion occurrence records as of 28 May 2022. It is by far the largest initiative assembling and sharing DAK to support scientific research, conservation, and sustainable development. We analyzed plant data published on the GBIF site at the scale of Africa to highlight the contribution of the continent to GBIF and thereby underline data gaps across taxonomic groups, basis of records, and geographic space. To this end, we downloaded data of the Plantae kingdom from Africa; they are available at https://doi.org/10.15468/dl.f79228. We processed and analyzed the data using R with several packages and related functions. Although Africa is home to a rich biodiversity with many hotspots, the global data contribution of Africa to GBIF is still incredibly low (1.37%). Furthermore, there are huge disparities between African countries, with South Africa alone contributing 65% of the data of the continent. The plant data of Africa (2,713,790 occurrence records) accounted for 9.11% of the data of the continent; this underlines huge gaps between taxonomic groups. Furthermore, the Magnoliopsida is the dominant plant class with the highest number of records (79.62%) and the highest number of species (71.85%), followed by the Liliopsida with 15.10% of the records and 18.16% of the species. Two bases of records were dominant: preserved specimens (75.49%) and human observation (18.60%).
In geographic space, plant data gaps are also quite large across the continent at either spatial resolution (half-degree or one-degree grid cells); data completeness is highest in West Africa, East Africa, Southern Africa, and Madagascar. The huge multidimensional data gaps identified in this study should be addressed as a priority in future data collections. Accessibility, either by roads or waterways, and protected areas are underpinning factors of data completeness across the continent. We deplored important data loss during data cleaning: records with adequate coordinates accounted for 71.03% of the initial data, while data fit for use in completeness analysis (records with adequate coordinates and full dates) are only about 65% of the records initially downloaded.

“…Where public and private data were duplicated, we gave preference to private data providers over the public data aggregators under the assumption that data providers have the most recent information. We also preferred manually cleaned occurrence records from Chesshire, et al 93 over those sourced directly from data aggregators. All pairwise duplicates were clustered where they overlapped and a single best occurrence was kept using the above arrangement.…”

Section: Duplicate Records (mentioning)

confidence: 99%

“…InputData. Contains the major repository downloads, additional input datasets, and custom Chesshire, et al 93 data files. 4.…”

Section: Data Records (mentioning)

confidence: 99%

“…Contains the “cleaned” (05_cleaned_database.csv) and “flagged-but-uncleaned” (05_unCleaned_database.csv) datasets, reports, and the R console outputs from the script (RunNotes_BeeBDC_1Sep23.txt). InputData. Contains the major repository downloads, additional input datasets, and custom Chesshire, et al 93 data files. ExtraTables. Contains metadata for the major repository downloads (MajorRepoAttributes_2023-09-01.xlsx) and the added taxonomy variants (AddedTaxonomyVariants.xlsx). The beesTaxonomy.Rda and beesChecklist.Rda datasets that are downloaded using BeeBDC (see 10.…”

Section: Data Records (mentioning)

confidence: 99%

A globally synthesised and flagged bee occurrence dataset and cleaning workflow

Dorey, Fischer, Chesshire et al. 2023

Preprint

Self Cite

Species occurrence data are foundational for research, conservation, and science communication. But the limited availability and accessibility of reliable data represents a major obstacle, particularly for insects, which face mounting pressures. We present BeeBDC, a new R package, and a global bee occurrence dataset to address this issue. We combined >17.7 million bee occurrence records from multiple public repositories (GBIF, SCAN, iDigBio, USGS, ALA) and smaller datasets, then standardised, flagged, deduplicated, and cleaned the data using the reproducible BeeBDC R-workflow. Specifically, we harmonised species names (following established global taxonomy), country names, and collection dates, and we added record-level flags for a series of potential quality issues. These data are provided in two formats, "completely-cleaned" and "flagged-but-uncleaned". Our data cleaning process is open and documented for transparency and reproducibility. The BeeBDC package and R Markdown are provided, and will be improved and updated regularly. By publishing reproducible R workflows and globally cleaned datasets we can increase the accessibility and reliability of downstream analyses. This workflow can be implemented for other taxa to support research and conservation.

Fewer bowl traps and more hand netting can increase effective number of bee species and reduce excessive captures

Larson, Pennarola, Leone et al. 2024

Ecology and Evolution

Reports increasingly point to substantial declines in wild bee abundance and diversity, yet there is uncertainty about how best to measure these attributes in wild bee populations. Two commonly used methods are passive trapping with bee bowls or active netting of bees on flowers, but each of these has drawbacks. Comparing the outcomes of the two methods is complicated by their uncomparable units of effort. The abundance distribution of bee species is also typically highly skewed, making it difficult to accurately assess diversity when rarer species are unlikely to be caught. The effective number of species, or Hill numbers, provides a way forward by basing the response metric on the number of equally abundant species. Our goal is to compare the effective number of bee species captured between hand netting and bowl trapping in wheatgrass prairie in South Dakota and tallgrass prairie in Minnesota, USA. Species overlap between the two methods ranged from ~40% to ~60%. Emphasis placed on rare species was important, so that 95% confidence limits overlapped between the two methods for species richness but netting exceeded trapping for Shannon's and Simpson's diversities. Netting always captured more bee species with fewer bee individuals than trapping. In most cases, the number of bees captured in bowl traps indicated substantial over‐sampling, with little increase in bee species detected. Correlations between bee and floral abundance, richness, and diversity differed between netted and trapped samples. We conclude that netting and trapping together produce a more complete account of species richness, but shifting sampling emphasis from trapping to netting will result in fewer bees, but more bee species captured. Due to the different relationships between bee and floral diversities that depended on sampling method, it is unwise to compare habitat associations determined by netting with those determined by trapping.
