SUMMARY
Many production Grid and e-Science infrastructures have begun to offer services to end-users during the past several years, with an increasing number of scientific applications that require access to a wide variety of resources and services in multiple Grids. The Grid Interoperation Now (GIN) Community Group of the Open Grid Forum therefore organizes and manages interoperation efforts among these production Grid infrastructures, with the goal of realizing a world-wide Grid at the technical level in the near future. This contribution highlights the fundamental approaches of the group and discusses open standards in the context of production e-Science infrastructures.
Abstract
Historically, high energy physics computing has been performed on large purpose-built computing systems. These began as single-site compute facilities, but have evolved into the distributed computing grids used today. Recently, there has been an exponential increase in the capacity and capability of commercial clouds. Cloud resources are highly virtualized and intended to be flexibly deployed for a variety of computing tasks. There is a growing interest among the cloud providers to demonstrate the capability to perform large-scale scientific computing. In this paper, we discuss results from the CMS experiment using the Fermilab HEPCloud facility, which utilized both local Fermilab resources and virtual machines in the Amazon Web Services Elastic Compute Cloud. We discuss the planning, technical challenges, and lessons learned involved in performing physics workflows on a large-scale set of virtualized resources. In addition, we discuss the economics and operational efficiencies of executing workflows both in the cloud and on dedicated resources.
Overview
The use of highly distributed systems for high-throughput computing has been very successful for the broad scientific computing community. Programs such as the Open Science Grid [1] allow scientists to gain efficiency by utilizing available cycles across different domains. Traditionally, these programs have aggregated resources owned at different institutes, adding the important capability to elastically contract and expand resources to match instantaneous demand. An appealing scenario is to extend the reach of these elastic resources to the rental market of commercial clouds.
A prototypical example of such a scientific domain is the field of High Energy Physics (HEP), which is strongly dependent on high-throughput computing. Every stage of a modern HEP experiment requires massive resources (compute, storage, networking). Detector- and simulation-generated data have to be processed and associated with auxiliary detector and beam information to generate physics objects, which are then stored and made available to the experimenters for analysis. In the current computing paradigm, the facilities that provide the necessary resources utilize distributed high-throughput computing, with global workflow, scheduling, and data management, enabled by high-performance networks. The computing resources in these facilities are either owned by an experiment and operated by laboratories and university partners (e.g. Energy Frontier experiments at the Large Hadron Collider (LHC) such as CMS and ATLAS) or deployed for a specific program, owned and operated by the host laboratory (e.g. Intensity Frontier experiments at Fermilab such as NOvA and MicroBooNE). The HEP investment to deploy and operate these resources is significant.
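The elastic expansion into commercial clouds described above can be illustrated with a small provisioning sketch. The following is a minimal example, not the HEPCloud provisioning code, assuming the boto3 Python SDK; the AMI ID, instance type, bid price, and key name are placeholders, not values from the paper.

```python
# Illustrative sketch (not the HEPCloud provisioner): requesting EC2 Spot
# capacity with boto3 to elastically expand a compute pool.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.request_spot_instances(
    SpotPrice="0.10",            # maximum bid in USD per instance-hour (placeholder)
    InstanceCount=100,           # burst size; a facility would scale far higher
    Type="one-time",
    LaunchSpecification={
        "ImageId": "ami-0123456789abcdef0",   # placeholder worker-node image
        "InstanceType": "m4.xlarge",          # placeholder instance type
        "KeyName": "hepcloud-demo",           # hypothetical key pair
    },
)

# Report the state of each spot request so a scheduler could track fulfillment.
for req in response["SpotInstanceRequests"]:
    print(req["SpotInstanceRequestId"], req["State"])
```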
Abstract
The data processing model for the CDF experiment is described. Data processing reconstructs events from parallel data streams taken with different combinations of physics event triggers and further splits the events into specialized physics datasets. The design of the processing control system faces strict requirements on bookkeeping records, which trace the status of data files and event contents during processing and storage. The computing architecture was updated to meet the mass data flow of the Run II data collection, recently upgraded to a maximum rate of 40 MByte/sec. The data processing facility consists of a large cluster of Linux computers, with data movement managed by the CDF data handling system to a multi-petaByte Enstore tape library. The latest processing cycle has achieved a stable speed of 35 MByte/sec (3 TByte/day) and can be readily scaled by increasing CPU and data-handling capacity as required.
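As a quick check of the quoted rates, a sustained 35 MByte/sec over a full day works out to roughly 3 TByte/day, consistent with the text:

```python
# Sanity check of the quoted throughput figures (decimal MByte/TByte).
rate_mb_per_s = 35
seconds_per_day = 24 * 60 * 60            # 86,400 s
tbytes_per_day = rate_mb_per_s * seconds_per_day / 1e6
print(f"{tbytes_per_day:.2f} TByte/day")  # -> 3.02 TByte/day
```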
In recent years, several new storage technologies, such as Lustre, Hadoop, and BlueArc, have emerged. While several groups have run benchmarks to characterize them under a variety of configurations, more work is needed to evaluate these technologies for the use cases of scientific computing on Grid clusters and Cloud facilities. This paper discusses our evaluation of these technologies as deployed on a test bed at FermiCloud, one of the Fermilab infrastructure-as-a-service Cloud facilities. The test bed consists of four server-class nodes with 40 TB of disk space and up to 50 virtual-machine clients, some running on the storage server nodes themselves. With this configuration, the evaluation compares the performance of these technologies when deployed on virtual machines and on "bare metal". In addition to running standard benchmarks such as IOzone to check the sanity of our installation, we ran I/O-intensive tests using experiment-specific applications. This paper presents how the storage solutions perform in a variety of realistic use cases of scientific computing.
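The following is a minimal, illustrative sanity probe in the spirit of the evaluation described above; the actual study used IOzone and experiment-specific applications, and the mount path here is a placeholder for whichever storage backend is under test.

```python
# Illustrative I/O sanity probe (not the paper's benchmark suite): write and
# re-read a 1 GB file on the mount under test and report effective throughput.
import os
import time

PATH = "/mnt/teststorage/io_probe.dat"   # hypothetical mount point under test
SIZE_MB = 1024
CHUNK = b"\0" * (1024 * 1024)            # 1 MB write unit

start = time.time()
with open(PATH, "wb") as f:
    for _ in range(SIZE_MB):
        f.write(CHUNK)
    f.flush()
    os.fsync(f.fileno())                 # force data out to the storage backend
write_mb_s = SIZE_MB / (time.time() - start)

start = time.time()
with open(PATH, "rb") as f:
    while f.read(1024 * 1024):
        pass
read_mb_s = SIZE_MB / (time.time() - start)

print(f"write: {write_mb_s:.1f} MB/s, read: {read_mb_s:.1f} MB/s")
os.remove(PATH)
```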