The study of biodiversity spans many disciplines and includes data pertaining to species distributions and abundances, genetic sequences, trait measurements, and ecological niches, complemented by information on collection and measurement protocols. A review of the current landscape of metadata standards and ontologies in biodiversity science suggests that existing standards such as the Darwin Core terminology are inadequate for describing biodiversity data in a semantically meaningful and computationally useful way. Existing ontologies, such as the Gene Ontology and others in the Open Biological and Biomedical Ontologies (OBO) Foundry library, provide a semantic structure but lack many of the necessary terms to describe biodiversity data in all its dimensions. In this paper, we describe the motivation for and ongoing development of a new Biological Collections Ontology, the Environment Ontology, and the Population and Community Ontology. These ontologies share the aim of improving data aggregation and integration across the biodiversity domain and can be used to describe physical samples and sampling processes (for example, collection, extraction, and preservation techniques), as well as biodiversity observations that involve no physical sampling. Together they encompass studies of: 1) individual organisms, including voucher specimens from ecological studies and museum specimens, 2) bulk or environmental samples (e.g., gut contents, soil, water) that include DNA, other molecules, and potentially many organisms, especially microbes, and 3) survey-based ecological observations. We discuss how these ontologies can be applied to biodiversity use cases that span genetic, organismal, and ecosystem levels of organization. We argue that if adopted as a standard and rigorously applied and enriched by the biodiversity community, these ontologies would significantly reduce barriers to data discovery, integration, and exchange among biodiversity resources and researchers.
To improve the suitability of the Darwin Core standard for the research and management of alien species, the standard needs to express the native status of organisms, how well established they are, and how they came to occupy a location. To facilitate this, we propose:
1. To adopt a controlled vocabulary for the existing Darwin Core term dwc:establishmentMeans.
2. To elevate the pathway term from the Invasive Species Pathways extension to become a new Darwin Core term, dwc:pathway, maintained as part of the Darwin Core standard.
3. To adopt a new Darwin Core term, dwc:degreeOfEstablishment, with an associated controlled vocabulary.
These changes to the standard will allow users to clearly state whether an occurrence of a species is native to a location or not, how it got there (pathway), and to what extent the species has become a permanent feature of the location. By improving Darwin Core for capturing and sharing these data, we aim to improve the quality of occurrence and checklist data in general and to increase the number of potential uses of these data.
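As a rough illustration of what a controlled vocabulary buys a data consumer, the sketch below checks an occurrence record against small vocabularies for two of the three terms. The vocabulary values are an illustrative subset chosen for this example, not the lists defined by the standard, and `validate_occurrence` is a hypothetical helper, not part of any Darwin Core tooling.

```python
# Illustrative subsets only (an assumption of this sketch); the authoritative
# controlled vocabularies are the ones published with the Darwin Core standard.
ESTABLISHMENT_MEANS = {"native", "introduced", "uncertain"}
DEGREE_OF_ESTABLISHMENT = {"native", "casual", "established", "invasive"}


def validate_occurrence(record):
    """Return a list of problems found in the vocabulary-controlled terms."""
    checks = {
        "establishmentMeans": ESTABLISHMENT_MEANS,
        "degreeOfEstablishment": DEGREE_OF_ESTABLISHMENT,
    }
    problems = []
    for term, vocabulary in checks.items():
        value = record.get(term)
        # A missing term is allowed; only a value outside the vocabulary is flagged.
        if value is not None and value not in vocabulary:
            problems.append(f"{term}: '{value}' is not in the vocabulary")
    return problems


# Hypothetical record: the pathway value is carried along but not validated here.
record = {
    "establishmentMeans": "introduced",
    "pathway": "examplePathwayValue",
    "degreeOfEstablishment": "invasive",
}
print(validate_occurrence(record))
```

With free-text values, the same check would be impossible, which is the practical argument for attaching a vocabulary to each term.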
Darwin-SW (DSW) is an RDF vocabulary designed to complement the Biodiversity Information Standards (TDWG) Darwin Core Standard. DSW is based on a model derived from a community discussion about the relationships among the main Darwin Core classes. DSW creates a new class to accommodate an important aspect of its model that is not currently part of Darwin Core: a class of Tokens, which are forms of evidence. DSW uses Web Ontology Language (OWL) to make assertions about the classes in its model and to define object properties that are used to link instances of those classes. A goal in the creation of DSW was to facilitate consistent markup of biodiversity data so that RDF graphs created by different providers could be easily merged. Accordingly, DSW provides a mechanism for testing whether its terms are being used in a manner consistent with its model. Two transitive object properties enable the creation of simple SPARQL queries that can be used to discover new information about linked resources whose metadata are generated by different providers. The Organism class enables semantic linking of biodiversity resources to vocabularies outside of TDWG that deal with observations and ecological phenomena.
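To make the role of a transitive property more concrete, here is a minimal sketch that merges triples from two hypothetical providers and follows one property across them. It uses plain tuples rather than an RDF library, and the property name `derivedFrom` and the `ex:` identifiers are assumptions made for illustration; a SPARQL engine with reasoning or property paths would do this work in practice.

```python
from collections import defaultdict


def transitive_closure(triples, predicate):
    """Return {subject: set of every object reachable via `predicate`}."""
    edges = defaultdict(set)
    for s, p, o in triples:
        if p == predicate:
            edges[s].add(o)

    closure = {}
    for start in list(edges):
        seen, stack = set(), list(edges[start])
        while stack:
            node = stack.pop()
            if node not in seen:  # guards against cycles in the data
                seen.add(node)
                stack.extend(edges.get(node, ()))
        closure[start] = seen
    return closure


# Hypothetical data: each provider knows only one link in the chain.
provider_a = [("ex:image1", "derivedFrom", "ex:specimen1")]
provider_b = [("ex:specimen1", "derivedFrom", "ex:organism1")]

merged = provider_a + provider_b
print(transitive_closure(merged, "derivedFrom")["ex:image1"])
```

Neither provider states that the image traces back to the organism; that link appears only after the graphs are merged and the property is treated as transitive.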
The Darwin Core vocabulary is widely used to transmit biodiversity data in the form of simple text files. In order to support expression of biodiversity data in the Resource Description Framework (RDF), a guide was created as a non-normative addition to the Darwin Core standard. This paper describes the major issues that were addressed in the creation of the guide, particularly problems related to adapting terms designed to have literal values for use with IRI references. By making it possible to express millions of existing records as RDF, the guide is an important step towards enabling the biodiversity informatics community to participate in broader Linked Data and Semantic Web efforts.
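The literal-versus-IRI distinction can be sketched by serializing the same Darwin Core term both ways as N-Triples. The abstract does not name the mechanism; this sketch assumes the convention of a parallel `dwciri:` namespace for IRI-valued terms, and the `example.org` identifiers and the `triple` helper are hypothetical.

```python
DWC = "http://rs.tdwg.org/dwc/terms/"   # terms intended for literal values
DWCIRI = "http://rs.tdwg.org/dwc/iri/"  # assumed namespace for IRI-valued analogues


def escape_literal(text):
    """Escape backslashes, quotes, and newlines for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def triple(subject, local_name, value, is_iri):
    """Serialize one statement as an N-Triples line."""
    namespace = DWCIRI if is_iri else DWC
    obj = f"<{value}>" if is_iri else f'"{escape_literal(value)}"'
    return f"<{subject}> <{namespace}{local_name}> {obj} ."


occurrence = "http://example.org/occurrence/1"  # hypothetical identifier
print(triple(occurrence, "recordedBy", "A. Collector", is_iri=False))
print(triple(occurrence, "recordedBy", "http://example.org/person/42", is_iri=True))
```

Keeping the two forms under separate property IRIs lets a consumer know in advance whether to expect a string or a dereferenceable resource.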