Data on marine biota exist in many formats and sources, such as published literature, data repositories, and unpublished material. Due to this heterogeneity, information is difficult to find, access and combine, severely impeding its reuse for further scientific analysis and its long-term availability for future generations. To address this challenge, we present CRITTERBASE, a publicly accessible data warehouse and interactive portal that currently hosts quality-controlled and taxonomically standardized presence/absence, abundance, and biomass data for 18,644 samples and 3,664 benthic taxa (2,824 of which at species level). These samples were collected by grabs, underwater imaging or trawls in Arctic, North Sea and Antarctic regions between the years 1800 and 2014. Data were collated from literature, unpublished data, own research and online repositories. All metadata and links to primary sources are included. We envision CRITTERBASE becoming a valuable and continuously expanding tool for a wide range of usages, such as studies of spatio-temporal biodiversity patterns, impacts and risks of climate change or the evidence-based design of marine protection policies.
Abstract. Over the last two decades, the Alfred Wegener Institute (AWI) has been continuously committed to developing and sustaining an e-Infrastructure for coherent discovery, visualization, dissemination and archival of scientific information in polar and marine regions. Most of the data originate from research activities carried out on a wide range of AWI-operated research platforms: vessels, land-based stations, ocean-based stations and aircraft. Archival and publishing in the PANGAEA repository, along with DOI assignment to individual datasets, is a typical end-of-line step for most data owners. Within AWI, a workflow for data acquisition from vessel-mounted devices, along with ingestion procedures for the raw data into the institutional archives, has been well established for many years. However, the increasing number of ocean-based stations and respective sensors, along with heterogeneous project-driven requirements towards satellite communication, sensor monitoring, QA/QC and validation, processing algorithms, visualization and dissemination, has recently led us to build a more generic and cost-effective framework. The main strengths of this framework, hereafter named O2A, are its seamless flow of sensor observations to archives and its compliance with internationally used OGC standards (e.g. SOS/SWE, WPS, WMS, WFS), which assures interoperability in an international context. O2A comprises several extensible and exchangeable modules (e.g. controlled vocabularies and gazetteers, file type and structure validation, aggregation solutions, processing algorithms, etc.) as well as various interoperability services. At the first data tier level, not only is each sensor described following the SensorML data model standard, but the data are also fed to an SOS interface offering streaming solutions along with support for O&M encoding.
Project administrators or data specialists are now able to monitor the individual sensors displayed on a map by simply clicking on a station and viewing the near real-time data for the selected station and sensor. In addition, the monitoring dashboards we built assist data scientists and administrators with early detection of sensor malfunction (e.g. email/SMS notification), filtering of data values by range (e.g. temperature values above a certain threshold) and data aggregation (e.g. calculation of daily averages).
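The range filtering and daily aggregation mentioned above can be illustrated with a minimal Python sketch. It is not the O2A dashboard implementation: the function names `values_above` and `daily_averages` and the representation of readings as `(timestamp, value)` pairs are assumptions made purely for illustration.

```python
from collections import defaultdict
from datetime import date, datetime
from statistics import mean


def values_above(readings: list[tuple[datetime, float]],
                 threshold: float) -> list[tuple[datetime, float]]:
    """Keep only readings whose value exceeds the threshold (hypothetical helper)."""
    return [(ts, value) for ts, value in readings if value > threshold]


def daily_averages(readings: list[tuple[datetime, float]]) -> dict[date, float]:
    """Group readings by calendar day and average each group (hypothetical helper)."""
    by_day: dict[date, list[float]] = defaultdict(list)
    for ts, value in readings:
        by_day[ts.date()].append(value)
    # Sort by day so the result is in chronological order.
    return {day: mean(values) for day, values in sorted(by_day.items())}
```

A dashboard could run the threshold filter on each incoming batch to decide whether a notification is due, and compute the daily averages for display; both steps are independent of how the readings were transmitted.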
The information system PANGAEA provides targeted support for research data management as well as long-term data archiving and publication. PANGAEA is operated as an open access library for archiving, publishing, and distributing georeferenced data from earth and environmental sciences. It focuses on observational and experimental data. Citability, comprehensive metadata descriptions, interoperability of data and metadata, a high degree of structural and semantic harmonization of the data inventory, and the commitment of the hosting institutions ensure the long-term usability of archived data. PANGAEA is a pioneer of FAIR and open data infrastructures that enable data-intensive science, and an integral component of national and international science and technology activities. This paper provides an overview of the recent organisational, structural, and technological advancements in developing and operating the information system.
<p>O2A (Observation to Archive) is a data-flow framework for heterogeneous sources, including multiple institutions and scales of Earth observation. In O2A, once data transmission is set up, processes are executed to automatically ingest (i.e. collect and harmonize) and quality-control data in near real-time. We consider a web-based sensor description application to support transmission and harmonization of observational time-series data. We also consider a product-oriented quality control, in which a standardized and scalable approach should integrate the diversity of sensors connected to the framework. A review of the literature and of observation networks in marine and terrestrial environments is in progress to allow us, for example, to characterize the quality tests in use for generic and specific applications. In addition, we use a standardized quality flag scheme to support both user and technical levels of information. In our outlook, a quality score should accompany each quality flag to indicate the overall plausibility of the individual data value or to measure the flagging uncertainty. In this work, we present concepts under development and give insights into the data ingest and quality control currently operating within the O2A framework.</p>
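<p>To make the idea of flag-based quality control concrete, the following Python sketch applies a simple range test to a series of values. It is an illustration only and does not reproduce the O2A flag scheme: the flag names in <code>Flag</code>, the <code>RangeTest</code> class and the <code>flag_series</code> helper are all hypothetical, and the limits used are arbitrary.</p>

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Flag(Enum):
    """Hypothetical quality flags; not the scheme used in O2A."""
    GOOD = "good"
    BAD = "bad"
    MISSING = "missing"


@dataclass(frozen=True)
class RangeTest:
    """Flags a value as GOOD when it lies within [lower, upper]."""
    lower: float
    upper: float

    def apply(self, value: Optional[float]) -> Flag:
        if value is None:
            return Flag.MISSING
        return Flag.GOOD if self.lower <= value <= self.upper else Flag.BAD


def flag_series(values: list[Optional[float]],
                test: RangeTest) -> list[tuple[Optional[float], Flag]]:
    """Pair every value with the flag produced by the given test."""
    return [(value, test.apply(value)) for value in values]
```

<p>For example, <code>flag_series([4.2, None, 99.0], RangeTest(-2.0, 35.0))</code> pairs each value with a flag while leaving the values themselves unchanged, so flagged data can be kept in the stream for later review.</p>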
<p>Today's rapid digital growth has made data the most essential tool for scientific progress in Earth Systems Science. Hence, we strive to assemble a modular research infrastructure comprising a collection of tools and services that allow researchers to turn big data into scientific outcomes.</p><p>Major roadblocks are (i) the increasing number and complexity of research platforms, devices, and sensors, (ii) the heterogeneous project-driven requirements towards, e.g., satellite data, sensor monitoring, quality assessment and control, processing, analysis and visualization, and (iii) the demand for near real-time analyses.</p><p>These requirements have led us to build a generic and cost-effective framework <strong>O2A</strong> (<strong>O</strong>bservation <strong>to</strong> <strong>A</strong>rchive) to enable, control, and access the flow of sensor observations to archives and repositories.</p><p>By establishing O2A within major cooperative projects like <strong>MOSES</strong> and <strong>Digital Earth</strong> in the research field Earth and Environment of the German Helmholtz Association, we extend research data management services, computing power, and skills to connect with the evolving software and storage services for data science. 
This fully supports the typical scientific workflow from beginning to end, that is, from data acquisition to final data publication.</p><p>The key modules of O2A's digital research infrastructure, established by AWI to enable Digital Earth Science, implement the <strong>FAIR</strong> principles:</p><ul><li><strong>Sensor Web</strong>, to register sensor applications and capture controlled metadata before and alongside any measurement in the field</li> <li><strong>Data ingest</strong>, allowing researchers to feed data into storage systems and processing pipelines in a prepared and documented way, ideally in controlled near real-time (NRT) data streams</li> <li><strong>Dashboards</strong>, allowing researchers to find and access data and to share and collaborate among partners</li> <li><strong>Workspace</strong>, enabling researchers to access and use data with research software in a cloud-based virtualized infrastructure that allows them to analyse massive amounts of data on the spot</li> <li><strong>Archiving</strong> and <strong>publishing data</strong> via repositories and Digital Object Identifiers (DOI).</li> </ul>