SUMMARY

Job Provenance (JP), part of the gLite Grid middleware, is a service that keeps a long-term trace of completed computations for later reference. It is a job-centric service that keeps records of the job life cycle, the job's environment, inputs and outputs, user parameters, and annotations. During the first provenance challenge, we explored the relation between this job-centric, Grid-oriented provenance and a more general data provenance approach. The challenge represents a use case that emphasizes areas that were not priorities in the original JP design. However, we demonstrated that the design is sufficiently general to cope with this mode of use. We also identified several areas where it is feasible to extend the current implementation.
The Job Provenance (JP) service is designed to automate the tracking of computations on large-scale Grids, thus giving users a tool to correctly archive information about their jobs and to resubmit any job in a reconstructed environment. JP provides a permanent minimal record of information about each job and its environment, to which free-form user annotations can be added. JP also offers the capability of configuring any number of indexed logical views on the large collections of raw data, allowing efficient processing of even complex user queries that select on both system data and annotations. The scalable architecture, capable of handling millions of jobs in a single JP installation and integrated into the EGEE gLite middleware environment, is presented.
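The idea of a minimal job record with free-form annotations, queried through a configurable indexed view, can be illustrated with a small sketch. This is not the JP implementation or its API: the names `JobRecord`, `IndexedView`, `feed`, and `query`, as well as the attribute names `owner` and `experiment`, are hypothetical and chosen only for illustration, and the sketch supports equality conditions only.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class JobRecord:
    """Hypothetical minimal record: system attributes plus free-form annotations."""
    job_id: str
    system: dict                                      # illustrative system data
    annotations: dict = field(default_factory=dict)   # user-supplied annotations

    def attribute(self, name):
        # One lookup covers both kinds of data, so a query can select on either.
        if name in self.system:
            return self.system[name]
        return self.annotations.get(name)


class IndexedView:
    """Hypothetical logical view: indexes a configured set of attributes."""

    def __init__(self, indexed_attrs):
        self.indexed_attrs = tuple(indexed_attrs)
        # attribute name -> attribute value -> set of job ids
        self._index = {a: defaultdict(set) for a in self.indexed_attrs}

    def feed(self, record):
        # Add one raw record to every index this view maintains.
        for attr in self.indexed_attrs:
            value = record.attribute(attr)
            if value is not None:
                self._index[attr][value].add(record.job_id)

    def query(self, **conditions):
        # Intersect the job-id sets matching each equality condition.
        result = None
        for attr, value in conditions.items():
            if attr not in self._index:
                raise KeyError(f"attribute {attr!r} is not indexed by this view")
            ids = self._index[attr].get(value, set())
            result = set(ids) if result is None else result & ids
        return sorted(result or [])


if __name__ == "__main__":
    view = IndexedView(["owner", "experiment"])
    view.feed(JobRecord("job-1", {"owner": "alice"}, {"experiment": "challenge"}))
    view.feed(JobRecord("job-2", {"owner": "alice"}))
    view.feed(JobRecord("job-3", {"owner": "bob"}, {"experiment": "challenge"}))
    # Select on system data (owner) and on an annotation (experiment) at once.
    print(view.query(owner="alice", experiment="challenge"))
```

In this sketch each view indexes only the attributes it was configured with, so several views with different attribute sets can be built over the same raw records, which mirrors the "any number of indexed logical views" described above.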