Abstract: Site-Based Data Curation (SBDC) is an approach to managing research data that prioritizes sharing and reuse of data collected at scientifically significant sites. The SBDC framework is based on geobiology research at natural hot spring sites in Yellowstone National Park as an exemplar case of high value field data in contemporary, cross-disciplinary earth systems science. Through stakeholder analysis and investigation of data artifacts, we determined that meaningful and valid reuse of digital hot spring data r…
“…The students reported finding this prospective discussion of data collection methods and best practices helpful in their work, and many were successful at producing robust metadata in their field books that would be key for curating and sharing their data. The spreadsheet template, as well as an excerpt of one student's completed template are available in our Supplemental Materials (https://doi.org/10.6084/m9.figshare.5450809); this work is also discussed further by Palmer et al ().…”
Section: Discussion
“…To that end, we developed a method of Research Process Modeling. This approach draws on systems analysis and information modeling approaches, and is informed by both our prior work on this project (Palmer et al, ), and prior work on computational process curation (Goble et al, ) and workflow‐centric research objects (for example, Bechhofer et al, ; Belhajjame et al, ). The simple inventory described above became one of four components required to document the artifacts, processes, and relationships involved in the collection of physical samples and observational data.…”
Section: Methods
“…Additionally, the inventory identifies the Minimum Information Framework (MIF) superclass and the Formats for each data artifact. The “MIF superclasses” are drawn from our prior work developing a high‐level information model for geobiology field data (Palmer et al, ). Classifying data artifacts according to MIF classes is helpful in supporting reuse of data for new purposes, and may aid access and retrieval functions as data collections are brought together in repositories over time.…”
Section: Methods
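The inventory described in the excerpt above pairs each data artifact with a MIF superclass and its formats. As a rough illustration of how such an inventory might be structured in practice, here is a minimal Python sketch; the superclass names and artifact entries below are invented for illustration and are not drawn from the published MIF.

```python
from dataclasses import dataclass

# Hypothetical inventory sketch: each field data artifact is recorded with a
# MIF superclass and the formats in which it exists. Class names such as
# "ObservationalData" and "SampleMetadata" are assumptions, not the published model.
@dataclass
class ArtifactRecord:
    name: str
    mif_superclass: str
    formats: list[str]

inventory = [
    ArtifactRecord("field notebook page", "ObservationalData",
                   ["scanned TIFF", "transcribed CSV"]),
    ArtifactRecord("water chemistry readings", "ObservationalData", ["CSV"]),
    ArtifactRecord("travertine sample label", "SampleMetadata",
                   ["paper tag", "spreadsheet row"]),
]

# Grouping artifacts by superclass is one way the classification could support
# the access and retrieval functions mentioned above.
by_class: dict[str, list[str]] = {}
for rec in inventory:
    by_class.setdefault(rec.mif_superclass, []).append(rec.name)
```

A repository aggregating collections over time could use such groupings to retrieve, say, all observational data across projects regardless of file format.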
“…Through the SBDC project, we sought to support the aggregation and integration of geobiology data within and across scientifically significant sites. In collaboration with geobiologists and National Park Service (NPS) personnel, we developed a Minimum Information Framework of key information classes that ought to be prioritized for collection and curation (Palmer et al, ). We additionally used the approaches described herein to identify optimal points of curatorial intervention in the research workflow; these are points at which data should be optimally documented and managed, thereby making field‐based processes retraceable, and the data collected reliably interpretable and reusable.…”
Section: The Case: Geobiology at Yellowstone National Park
A comprehensive record of research data provenance is essential for the successful curation, management, and reuse of data over time. However, creating such detailed metadata can be onerous, and there are few structured methods for doing so. In this case study of data curation in support of geobiology research conducted at Yellowstone National Park, we describe a method of "Research Process Modeling" for documenting noncomputational data provenance in a structured yet flexible way. The method combines systems analysis techniques to model research activities, the World Wide Web Consortium Provenance (PROV) ontology to illustrate relationships between data products, and simple inventory methods to account for research processes and data products. It also supports collaborative data curation between information professionals and researchers, and is therefore a significant step toward producing more useable and interpretable research data. We demonstrate how this method describes data provenance more robustly than "flat" metadata alone and fills a critical gap in the documentation of provenance for field-based and noncomputational workflows. We discuss potential applications of this approach to other research domains.
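The abstract above describes combining the W3C PROV ontology with simple inventories to relate data products. To make the idea concrete, here is a minimal, stdlib-only sketch of PROV-style relationships recorded as triples and traversed to recover a derivation chain. The predicate names follow the W3C PROV vocabulary; the specific artifacts ("field_notebook", "site_visit_2016", etc.) are invented for illustration, and a real implementation would more likely use an RDF library.

```python
# Each triple is (subject, predicate, object), using PROV predicate names.
triples = [
    ("transcribed_spreadsheet", "prov:wasDerivedFrom", "field_notebook"),
    ("geochemistry_table", "prov:wasDerivedFrom", "transcribed_spreadsheet"),
    ("field_notebook", "prov:wasGeneratedBy", "site_visit_2016"),
]

def derivation_chain(entity, triples):
    """Walk prov:wasDerivedFrom links back to the original artifact."""
    chain = [entity]
    while True:
        sources = [o for s, p, o in triples
                   if s == chain[-1] and p == "prov:wasDerivedFrom"]
        if not sources:
            return chain
        chain.append(sources[0])
```

Calling `derivation_chain("geochemistry_table", triples)` traces the table back through the transcribed spreadsheet to the field notebook, which is the kind of noncomputational provenance question flat metadata alone cannot answer.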
“…Detailed description of all protocols and techniques used for field collection, biomolecule extractions, and meta-omic analyses are presented in the Supplementary Information and briefly summarized here. Detailed descriptions of the experimental design and metadata curation strategies adopted for all aspects of the field and laboratory analyses in the present study are presented in the works of Palmer et al (2017) and Thomer et al (2018).…”
The evolutionarily ancient Aquificales bacterium Sulfurihydrogenibium spp. dominates filamentous microbial mat communities in shallow, fast-flowing, and dysoxic hot-spring drainage systems around the world. In the present study, field observations of these fettuccini-like microbial mats at Mammoth Hot Springs in Yellowstone National Park are integrated with geology, geochemistry, hydrology, microscopy, and multi-omic molecular biology analyses. Strategic sampling of living filamentous mats along with the hot-spring CaCO3 (travertine) in which they are actively being entombed and fossilized has permitted the first direct linkage of Sulfurihydrogenibium spp. physiology and metabolism with the formation of distinct travertine streamer microbial biomarkers. Results indicate that, during chemoautotrophy and CO2 carbon fixation, the 87–98% Sulfurihydrogenibium-dominated mats utilize chaperones to facilitate enzyme stability and function. High-abundance transcripts and proteins for type IV pili and extracellular polymeric substances (EPSs) are consistent with their strong mucus-rich filaments tens of centimeters long that withstand hydrodynamic shear as they become encrusted by more than 5 mm of travertine per day. Their primary energy source is the oxidation of reduced sulfur (e.g., sulfide, sulfur, or thiosulfate) and the simultaneous uptake of extremely low concentrations of dissolved O2 facilitated by bd-type cytochromes. The formation of elevated travertine ridges permits the Sulfurihydrogenibium-dominated mats to create a shallow platform from which to access low levels of dissolved oxygen at the virtual exclusion of other microorganisms. These ridged travertine streamer microbial biomarkers are well preserved and create a robust fossil record of microbial physiological and metabolic activities in modern and ancient hot-spring ecosystems.
Scientifically significant sites are the source of, and long-term repository for, considerable amounts of data, particularly in the natural sciences. However, the unique data practices of the researchers and resource managers at these sites have been relatively understudied. Through case studies of two scientifically significant sites (the hot springs at Yellowstone National Park and the fossil deposits at the La Brea Tar Pits), I developed rich descriptions of site-based research and data curation, and high-level data models of information classes needed to support integrative data reuse. Each framework treats the geospatial site and its changing natural characteristics as a distinct class of information; more commonly considered information classes such as observational and sampling data, and project metadata, are defined in relation to the site itself. This work contributes (a) case studies of the values and data needs for researchers and resource managers at scientifically significant sites, (b) an information framework to support integrative reuse at these sites, and (c) a discussion of data practices at scientifically significant sites.
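The framework described above treats the site itself as a distinct information class, with other classes defined in relation to it. A minimal sketch of that modeling choice might look like the following; the class and field names are assumptions for illustration, not the published framework.

```python
from dataclasses import dataclass, field

# Sketch of a site-centric data model: the site is a first-class record, and
# sampling data references it rather than duplicating locality information.
@dataclass
class Site:
    name: str
    location: tuple[float, float]  # (lat, lon); site characteristics change over time

@dataclass
class SamplingEvent:
    site: Site                     # defined in relation to the site itself
    date: str
    samples: list[str] = field(default_factory=list)

mammoth = Site("Mammoth Hot Springs", (44.97, -110.70))
event = SamplingEvent(mammoth, "2016-07-12", ["travertine core A"])
```

Because every sampling event points back to one shared `Site` record, data collected by different projects at the same site can be aggregated without reconciling per-project locality descriptions.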