Scientific workflows, models of computation that capture the orchestration of scientific codes to conduct "in silico" research, are gaining recognition as an attractive alternative to script-based orchestration. Despite growing interest, researchers developing scientific workflow technologies must address a number of fundamental challenges, including developing the underlying "science" of scientific workflows. In this article, we present a broad classification of scientific workflow environments based on three major phases of "in silico" research, and we highlight active research projects that illustrate this classification. With this tripartite classification, scientists will be able to make more informed decisions regarding the adoption of particular workflow environments.
We describe a reusable architecture and implementation framework for managing science processing pipelines for mission ground data systems. Our system, the Process Control System (PCS), improves upon an existing software component, the OODT Catalog and Archive (CAS), which has already supported the QuikSCAT, SeaWinds and AMT earth science missions. This paper focuses on PCS within the context of two current earth science missions: the Orbiting Carbon Observatory (OCO) and the NPP Sounder PEATE projects.
Data-intensive systems and applications transfer large volumes of data and metadata to highly distributed users separated by geographic distance and organizational boundaries. An influential element in these large-volume data transfers is the selection of a software connector that satisfies user constraints on the required data distribution scenarios. Currently, this task is typically accomplished by consulting "gurus", who rely on their intuitions, at best backed by anecdotal evidence. In this paper we present a systematic approach for selecting software connectors based on eight key dimensions of data distribution, which we use to represent the data distribution scenarios. Our approach, dubbed DISCO, has been implemented as a Java-based framework. Early experience with DISCO indicates good accuracy and scalability.
While software architectures have been shown to aid developers in maintenance, reuse, and evolution, as well as many other software engineering tasks, there is little language-level support for architectural concepts in legacy programming languages such as Fortran and C. Because many existing scientific codes are written in these languages, it is difficult to integrate them into architected software systems. By wrapping these scientific codes in architecturally aware Java interfaces, we are able to componentize legacy programs, integrating them into systems built with first-class architectural elements while meeting the performance and throughput requirements of scientific codes.