Summary
In this paper, we investigate several design choices for HPC services at different layers of the cloud computing architecture to simplify and broaden HPC use cases. We start with the platform-as-a-service (PaaS) layer and compare direct and iterative parallel linear equation solvers. We observe that several matrix properties, which can be identified before starting long-running solvers, can help HPC services automatically select the amount of computing resources per job so that job latency is minimized and overall job throughput is maximized. As a proof of concept, we use classical problems in structural mechanics and mesh them at increasing granularities, yielding matrices of various sizes, the largest having 1 billion non-zero elements. In addition to matrix size, we take into account matrix condition numbers, preconditioning effects, and solver types, and execute these finite element analyses (FEA) on an IBM HPC cluster. Next, we focus on the infrastructure-as-a-service (IaaS) layer and explore HPC application performance, load isolation, and deployment issues using application containers (Docker), comparing them to physical and virtual machines (VMs) on a public cloud.
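To make the idea of property-driven resource selection concrete, the following is a minimal sketch, not the method used in this paper. It assumes a hypothetical `MatrixProfile` record holding properties known before the solve (size, non-zero count, an estimated condition number, and whether the matrix is symmetric positive definite) and a hypothetical `suggest_job_config` helper. The thresholds `cond_threshold`, `nnz_per_core`, and `max_cores` are illustrative placeholders, not measured values.

```python
import math
from dataclasses import dataclass


@dataclass
class MatrixProfile:
    """Hypothetical record of properties known before the solver starts."""
    n_rows: int
    nnz: int                  # number of non-zero elements
    cond_estimate: float      # estimated condition number
    is_spd: bool              # symmetric positive definite?


def suggest_job_config(profile, max_cores=64, nnz_per_core=5_000_000,
                       cond_threshold=1e8):
    """Pick a solver family and core count from matrix properties.

    All thresholds are illustrative assumptions, not values from the paper.
    """
    # Iterative solvers typically converge slowly on ill-conditioned systems,
    # so this sketch falls back to a direct solver above the threshold.
    if profile.cond_estimate > cond_threshold:
        solver = "direct"
        preconditioner = None
    else:
        solver = "iterative"
        # Incomplete Cholesky requires an SPD matrix; otherwise use ILU.
        preconditioner = "incomplete-cholesky" if profile.is_spd else "ilu"

    # Scale cores with the non-zero count, clamped to [1, max_cores].
    cores = min(max_cores, max(1, math.ceil(profile.nnz / nnz_per_core)))

    return {"solver": solver, "preconditioner": preconditioner, "cores": cores}


small = suggest_job_config(MatrixProfile(1_000, 10_000, 1e3, True))
large = suggest_job_config(MatrixProfile(10**8, 10**9, 1e10, False))
```

In a real service, the thresholds would have to be calibrated from benchmark runs on the target cluster rather than fixed by hand as they are here.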