The paper presents an initiative to mobilise literature-based occurrence data on fungi and fungi-related organisms (literature-based occurrences, Darwin Core MaterialCitation) and to develop the Fungal literature-based occurrence database for the southern West Siberia (FuSWS). The mobilisation of literature-based occurrence data started in the northern part of West Siberia in 2016. The present project extends the initiative to the southern regions and includes ten administrative territories (Tyumen Region, Sverdlovsk Region, Chelyabinsk Region, Omsk Region, Kurgan Region, Tomsk Region, Novosibirsk Region, Kemerovo Region, Altai Territory and Republic of Altai). The area occupies the central to southern part of the West Siberian Plain. It extends for about 1,500 km from west to east, from the eastern slopes of the Ural Mountains to the Yenisey River, and for about 1,300 km from north to south. The total area is about 1.4 million km². The initiative is actively growing in spatial coverage, collaboration and data accumulation. A working group of about 30 mycologists from eight organisations, dedicated to the data mobilisation, was created as part of the Siberian Mycological Society (an informal organisation since 2019). They have compiled an almost complete bibliographic list of mycology-related papers for southern West Siberia, including over 900 publications from the last two centuries (the earliest dated 1800). All literature sources were digitised, and an online library was created with the Zotero bibliography manager to integrate the bibliographic metadata and the digitised papers. The analysis of published sources showed that about two-thirds of the works contain occurrences of fungi within the scope of mobilisation. At the time of paper submission, the database had been populated with about 8,000 records from 93 sources. The dataset is uploaded to GBIF, where it is available for online search of species occurrences and/or download.
The project's page with the introduction, templates, bibliography list, video presentations and written instructions is available (in Russian) at the website of the Siberian Mycological Society. The initiative will continue in the following years to extract the records from all published sources. The paper presents the first project aimed at mobilising literature-based occurrence data on fungi and fungi-related organisms in southern West Siberia. The full bibliography and digital library of all regional mycological publications, created for the first time, include about 900 published works. By the time of paper submission, nearly 8,000 occurrence records had been extracted from about 90 literature sources and integrated into the FuSWS database published in GBIF.
The authors of this paper summarize the majority of published data on the distribution of agaricoid and boletoid fungi recorded in Russia, covering the period from 1824 through 2020. A comprehensive list of 6867 scientific names based on 954 publications was compiled for the first time for the whole territory of Russia. All records have been checked through Index Fungorum. The work consists of a review section and five appendices. The review section discusses the intensity of field research and accumulation of data on the distribution of agaricoid and boletoid fungi in Russia, both historically and in its current state. The authors discuss the current state of knowledge on the biodiversity of regions of Russia and point out blank spots, thus providing a reference and an “action plan” for the future. Appendix A presents a list of 6142 taxa unambiguously ascribed to 3246 accepted current names. Appendix B contains 727 names that cannot be ascribed to any accepted current names unequivocally, with reasons given (e.g., no current name, wrong authors’ citations, absence from Index Fungorum). Names from both checklists are supplemented with data on the distribution of these taxa within the Russian Federation and references to published records. Appendix C contains a list of accepted current names reported from only one region. Appendix D is an overview of the nearly 200 years of research of agaricoid and boletoid fungi for all regions of Russia. Appendix E is a list of references used for checklists and study history preparation.
The abstract presents the initiative to develop the Fungal Literature-based Occurrence Database for Southern West Siberia (FuSWS), which mobilizes occurrences of fungi from published literature (literature-based occurrences, Darwin Core MaterialCitation). The FuSWS database includes 28 fields describing species name, publication source, herbarium number (if any), date of sampling or observation, locality information, vegetation, substrate, and others. The initiative on digitization of literature-based occurrence data started in the northern part of Western Siberia two years ago (Filippova et al. 2021a). The present project extends the initiative to the south and includes eight administrative regions (Sverdlovsk, Omsk, Kurgan, Tomsk, Novosibirsk, Kemerovo, Altay, and Gorny Altay). The area occupies the central to southern part of the West Siberian Plain. It extends for about 1.5 thousand km from west to east, from the eastern slopes of the Ural Mountains to the Yenisey River, and for about 1.3 thousand km from north to south. The total area is about 1.2 million km². Currently, the project is actively growing in spatial coverage, collaboration and data accumulation. A working group of about 30 mycologists from 16 organizations, dedicated to the digitization initiative, was created as part of the Siberian Mycological Society (an informal organization since 2019). They have created the most complete bibliographic list of mycology-related papers for Southern West Siberia, including over 800 publications from the last two centuries (the earliest dated 1800). At the time of abstract submission, the database had been populated with about 10,000 records from about 100 sources. The dataset is uploaded to GBIF, where it is available for online search of species occurrences and/or download (Filippova et al. 2021b; Fig. 1).
The project's page with the introduction, templates, bibliography list, video presentations and written instructions is available at the website of the Siberian Mycological Society (https://sibmyco.org/literaturedatabase). The following protocol describes the digitization workflow in detail. The bibliography of related publications is compiled using the Zotero bibliographic manager. Only published works (peer-reviewed papers, conference proceedings, PhD theses, monographs or book chapters) are selected. If possible, the sources are digitized and added to the library as PDF files. The template of the FuSWS database is made with Google Sheets, which allows several specialists to work simultaneously in a common data format. A simple Microsoft Excel template is also available for offline databasing. The Darwin Core standard is applied to the database field structure to accommodate the relevant information extracted from the publications. From the available bibliography of publications related to the region, only works with species occurrences are selected for databasing. The main source of occurrences is annotated species lists with exact localities of the records. However, other kinds of species citations are also extracted, provided that they are linked to a geographic location. All occurrences are georeferenced, either from the coordinates provided in the paper or from the verbatim description of the field work locality. The verbatim descriptions are georeferenced using Yandex or Google map services.
Depending on the quality of the georeference provided in a publication, the uncertainty is estimated as follows: 1) the coordinate of a fruiting structure or a plot provided in the publication gives an uncertainty of about 3-30 m; 2) the coordinate of the field work locality provided in the publication gives an uncertainty of about 500 m to 5 km; 3) a report of the species' presence in a particular region gives the centroid of the area, with an uncertainty radius large enough to include its borders. The locality names reported in Russian are translated to English and written in the «locality» field. The Russian descriptions are preserved in the «verbatimLocality» field for accuracy. When possible, the «eventDate» is extracted from the annotation data. When this information is absent, the date of the publication is used instead, with a remark in the «verbatimEventDate» field. The ecological features, habitat and substrate preferences are written in the «habitat» field and kept in Russian. The original scientific names reported in publications are entered in the «originalNameUsage» field. Spelling errors are corrected using the GBIF Species Matching tool. This tool is also used to create the additional fields of the taxonomic hierarchy from species to kingdom, to fill in the «taxonRank» field and to synonymize names according to the GBIF Backbone Taxonomy. To track the digitization process, a worksheet is maintained. Each bibliographic record has a series of fields describing the digitization process and its results: the total number of extracted occurrence records, a general description of the occurrence quality, the presence of the observation date, details of georeferencing and the name of the person responsible for the digitization.
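The uncertainty tiers and the date fallback described in the protocol can be illustrated with a short Python sketch. It is not the project's own tooling: the function names, the tier keys ("plot", "locality"), the wording of the date remark and the spherical-Earth distance formula are all assumptions made for illustration. Only the numeric ranges and the Darwin Core field names come from the protocol.

```python
import math
from typing import Dict, Iterable, Optional, Tuple

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation (assumption)

# Uncertainty ranges in metres for tiers 1 and 2, as stated in the protocol.
# The dictionary keys are hypothetical labels, not FuSWS vocabulary.
TIER_RANGES_M = {
    "plot": (3, 30),          # coordinate of a fruiting structure or a plot
    "locality": (500, 5000),  # coordinate of the field work locality
}


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def region_uncertainty_m(centroid: Tuple[float, float],
                         boundary: Iterable[Tuple[float, float]]) -> float:
    """Tier 3: radius around the centroid that reaches the farthest boundary vertex."""
    lat0, lon0 = centroid
    return max(haversine_m(lat0, lon0, lat, lon) for lat, lon in boundary)


def event_date_fields(observed: Optional[str], publication_year: int) -> Dict[str, str]:
    """Use the observation date if reported, otherwise fall back to the publication year."""
    if observed:
        return {"eventDate": observed}
    return {
        "eventDate": str(publication_year),
        # Hypothetical remark text; the protocol only says that a remark is added.
        "verbatimEventDate": "date of publication used; observation date not reported",
    }
```

For a region-level record, `region_uncertainty_m` would be called with the region's centroid and a list of boundary vertices taken from any available outline; how the centroid and outline are obtained is left open here, since the protocol does not specify it.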