2019
DOI: 10.1051/epjconf/201921403060
HEPCloud, an Elastic Hybrid HEP Facility using an Intelligent Decision Support System

Abstract: HEPCloud is rapidly becoming the primary system for provisioning compute resources for all Fermilab-affiliated experiments. In order to reliably meet the peak demands of the next generation of High Energy Physics experiments, Fermilab must plan to elastically expand its computational capabilities to cover the forecasted need. Commercial cloud and allocation-based High Performance Computing (HPC) resources both have explicit and implicit costs that must be considered when deciding when to provision these resour…

Cited by 9 publications (8 citation statements)
References 8 publications

“…Figure 13.1 illustrates the entire chain, including interaction with storage elements within the job. The architecture can also provision resources on high-performance computing (HPC) resources, such as Cori at the NERSC, within the HEPCloud [141] infrastructure. The submission mechanism is unchanged whether the jobs are High Throughput Computing (HTC) or HPC; this seamless transition is key to efficiently utilizing available resources and also saves the job submitter significant effort by not requiring customized submission infrastructure for different resource types.…”
Section: Production Operations Management System (mentioning)
confidence: 99%
“…FACILE is particularly well-suited for an online computing application because the algorithm it replaces is responsible for 15% of the HLT latency per event. In this study, the clients are deployed as jobs running single-thread HLT instances on virtual machines in Google Cloud using the HEPCloud framework [40][41][42]. HEPCloud deploys jobs submitted on batch systems to CPU instances created dynamically at a cloud computing site.…”
Section: Online Computing (mentioning)
confidence: 99%
“…Figure 1 illustrates the entire chain, including interaction with storage elements within the job. The architecture can also provision resources on High-Performance Computing (HPC) resources, such as Cori at the National Energy Research Scientific Computing Center (NERSC), within the HEPCloud [7,8] infrastructure. The submission mechanism is unchanged whether the jobs are HTC or HPC; this seamless transition is key to efficiently utilizing available resources and also saves the job submitter significant effort by not requiring customized submission infrastructure for different resource types.…”
Section: Current Job Submission Infrastructure (mentioning)
confidence: 99%