Scientists at General Atomics (GA) in the United States remotely conducted experiments on the Experimental Advanced Superconducting Tokamak (EAST) in China during its third shift. They led these experiments from a dedicated remote control room that used a novel hardware and software infrastructure to support data movement, visualization, and communication on the time scale of EAST's pulse cycle. This Fusion Science Collaboration Zone infrastructure moves large amounts of data between continents quickly, with a 300-fold increase in data transfer rate over that available using the traditional transmission protocol. Real-time data from control systems is moved almost instantaneously. An event system tied to the EAST pulse cycle automatically initiates data transfers, so that bulk EAST data is transferred to GA within minutes. The EAST data at GA is served via MDSplus to approved US collaborators, which prevents multiple US clients from requesting data from EAST and competing for bandwidth on the long-haul network. At present there are 37 approved scientists from 8 US research institutions.
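The pattern of an event-triggered transfer followed by local serving can be illustrated with a minimal Python sketch. It is not the Fusion Science Collaboration Zone implementation and does not use MDSplus. The class `ShotMirror`, the callable `fetch_remote`, the shot number 1001, and the sample payload are all hypothetical names and values chosen for illustration.

```python
class ShotMirror:
    """Hypothetical local mirror: one long-haul fetch per shot, many local reads."""

    def __init__(self, fetch_remote):
        self._fetch_remote = fetch_remote  # assumed callable: shot number -> data
        self._store = {}                   # local copies keyed by shot number
        self.remote_fetches = 0            # counts transfers over the long-haul link

    def on_pulse_end(self, shot):
        """Event handler: copy the shot across the long-haul link at most once."""
        if shot not in self._store:
            self._store[shot] = self._fetch_remote(shot)
            self.remote_fetches += 1

    def read(self, shot):
        """Serve a collaborator's request from the local copy only."""
        return self._store[shot]


def fake_remote(shot):
    # Stand-in for the remote experiment's data server (illustrative payload).
    return {"shot": shot, "signal": [0.0, 1.0]}


mirror = ShotMirror(fake_remote)
mirror.on_pulse_end(1001)                        # end-of-pulse event triggers the transfer
replies = [mirror.read(1001) for _ in range(3)]  # three local clients read the same shot
```

In this sketch, three client reads cost a single remote fetch, which is the bandwidth argument the abstract makes for serving the data from GA.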
Data from large-scale experiments and extreme-scale computing is expensive to produce and may be used for critical applications. However, it is not the mere existence of data that is important, but our ability to make use of it. Experience has shown that when metadata is better organized and more complete, the underlying data becomes more useful. Traditionally, capturing the steps of scientific workflows and their metadata was the role of the lab notebook, but the digital era has instead fragmented data, processing, and annotation. This paper presents the Metadata, Provenance, and Ontology (MPO) System, software that automates the documentation of scientific workflows and associated information. Based on recorded metadata, it provides explicit information about the relationships among the elements of workflows in notebook form, augmented with directed acyclic graphs. A set of web-based graphical navigation tools and an application programming interface (API) have been created for searching, browsing, and programmatically accessing the workflows and data. We describe the MPO concepts and software architecture. We also report the current status of the software and the initial deployment experience.
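To make the idea of a workflow recorded as a directed acyclic graph concrete, here is a minimal Python sketch. It is not the MPO API or its data model. The `Workflow` class, its methods, and the node names and metadata fields are hypothetical, chosen only to show how recorded relationships let a tool answer "what was this result derived from?".

```python
from dataclasses import dataclass, field


@dataclass
class Workflow:
    """Hypothetical provenance record: nodes with metadata plus directed edges."""
    name: str
    nodes: dict = field(default_factory=dict)  # node id -> metadata dict
    edges: list = field(default_factory=list)  # (parent id, child id) pairs

    def add_node(self, node_id, kind, **metadata):
        self.nodes[node_id] = {"kind": kind, **metadata}

    def ancestors(self, node_id):
        """Return every node the given node was derived from."""
        seen, stack = set(), [node_id]
        while stack:
            current = stack.pop()
            for parent, child in self.edges:
                if child == current and parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def connect(self, parent, child):
        """Record that `child` was derived from `parent`, keeping the graph acyclic."""
        if parent not in self.nodes or child not in self.nodes:
            raise KeyError("both endpoints must be recorded first")
        # A cycle would appear if `child` is already upstream of `parent`.
        if parent == child or child in self.ancestors(parent):
            raise ValueError("edge would create a cycle")
        self.edges.append((parent, child))


wf = Workflow("example analysis")
wf.add_node("raw", "data", note="measured signal")
wf.add_node("fit", "activity", note="profile fitting step")
wf.add_node("profiles", "data", note="fitted profiles")
wf.connect("raw", "fit")
wf.connect("fit", "profiles")
lineage = wf.ancestors("profiles")  # the set {"raw", "fit"}
```

Storing edges explicitly, rather than inferring them from file names or timestamps, is what allows the lineage of a result to be queried after the fact.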