Anais Do XXIII Simpósio Em Sistemas Computacionais De Alto Desempenho (SSCAD 2022) 2022
DOI: 10.5753/wscad.2022.226528

HPC@Cloud: A Provider-Agnostic Software Framework for Enabling HPC in Public Cloud Platforms

Abstract: The cloud computing paradigm democratized access to compute infrastructure for millions of resource-strained organizations, applying economies of scale to massively reduce infrastructure costs. In the High Performance Computing (HPC) context, the benefits of using public cloud resources make it an attractive alternative to expensive on-premises clusters; however, there are several challenges and limitations. In this paper, we present HPC@Cloud: a provider-agnostic software framework that comprises a set of key softw…

Cited by 8 publications (10 citation statements)
References 17 publications
“…The HPC@Cloud tool [Munhoz and Castro 2022] is open-source software being developed by the Laboratório de Pesquisa em Sistemas Distribuídos (LaPeSD) at the Universidade Federal de Santa Catarina (UFSC) to make it easier to build HPC clusters on public clouds. The tool currently supports running parallel applications implemented in OpenMP and MPI on the Amazon Web Services (AWS) and Vultr Cloud providers, including the ability to build spot clusters on AWS [Munhoz and Castro 2022]. In addition to the interface, the tool provides prebuilt virtual machine images and snapshots to speed up infrastructure construction, thereby reducing costs.…”
Section: Profiling in the HPC@Cloud Tool
unclassified
“…Cloud computing can also be an attractive option for researchers in High Performance Computing (HPC), because it allows smaller research groups with limited access to HPC clusters to run large-scale experiments quickly and at a much lower cost than purchasing, managing, and maintaining physical on-premises infrastructure [Saini and Sainis 2021]. To make it easier to build and configure HPC clusters on cloud platforms in a provider-agnostic way, the open-source tool HPC@Cloud was developed [Munhoz and Castro 2022]. The tool automates the entire process of allocating and configuring the infrastructure, so researchers can run their HPC applications in the cloud without worrying about the technical issues involved in creating the environment.…”
Section: Introduction
unclassified
“…This collaboration, or the convincing of one community to engage with the other, is arguably more challenging than development work. When approaching those on the HPC side, discussions of cloud adoption focus on performance, costs [11, 12] and security [13]. Approaching from the cloud side, there can be a lack of understanding of high performance computing technologies, and how they differ from or might potentially improve microservices.…”
Section: Introduction
mentioning
confidence: 99%
“…It features a flexible command line interface that allows researchers to easily specify the required computing resources and configurations when building the HPC infrastructure on public cloud providers. MPI applications can run seamlessly on the provisioned HPC cluster using TCP/IP or InfiniBand (when available) for communications;
We integrate into HPC@Cloud all blocking fault tolerance strategies tailored to tightly-coupled Message Passing Interface (MPI) applications running on AWS spot instances proposed in Reference 6;
We propose a new adaptive fault tolerance strategy that does not stop the MPI application when AWS revokes spot instances, thus providing better performance than blocking fault tolerance strategies proposed in Reference 6;
We introduce support for Singularity containers, allowing more portability when migrating legacy HPC code to the cloud, and evaluate their performance overheads;
We propose an empirical approach for estimating job execution costs on public clouds based on the execution of the target application with a small input problem size and gathered execution metrics, then later used to train a regression model for predicting total execution time; and
We provide a practical cost-efficiency analysis of executing a set of NAS Parallel Benchmarks (NPB) applications [7] and a parallel solver implementation for Laplace's heat diffusion equations based on a popular alternative Finite Difference Method (FDM) technique called Forward-Time Central Space (FTCS) method [5, 8] on two popular public cloud platforms (AWS and Vultr Cloud) across multiple cluster configurations, giving an in-depth analysis of the advantages, main performance bottlenecks, costs, and pitfalls of using public cloud resources for HPC.…”
Section: Introduction
mentioning
confidence: 99%
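The Forward-Time Central Space (FTCS) method named in the statement above can be illustrated with a minimal sketch. It is not the solver evaluated in the cited work: it assumes a one-dimensional domain with fixed boundary values and runs serially rather than with MPI, and the names `ftcs_step` and `solve_heat_1d` and all parameters are hypothetical. The update rule is the standard explicit scheme u_i^{n+1} = u_i^n + r (u_{i+1}^n - 2 u_i^n + u_{i-1}^n) with r = alpha * dt / dx^2, which is stable for r <= 0.5.

```python
def ftcs_step(u, r):
    """Advance the temperature profile u by one time step.

    Forward difference in time, central difference in space.
    The two boundary values are left unchanged (assumed fixed).
    """
    new = list(u)
    for i in range(1, len(u) - 1):
        new[i] = u[i] + r * (u[i + 1] - 2.0 * u[i] + u[i - 1])
    return new


def solve_heat_1d(u0, alpha, dx, dt, steps):
    """Apply `steps` FTCS updates to the initial profile u0.

    alpha is the thermal diffusivity; dx and dt are the grid spacing
    and time step (illustrative parameters, not from the source).
    """
    r = alpha * dt / (dx * dx)
    if r > 0.5:
        # The explicit scheme is unstable above this threshold.
        raise ValueError("unstable step: alpha*dt/dx^2 must be <= 0.5")
    u = list(u0)
    for _ in range(steps):
        u = ftcs_step(u, r)
    return u


if __name__ == "__main__":
    # A single hot point in the middle of a cold rod diffuses outward.
    profile = solve_heat_1d([0.0, 0.0, 1.0, 0.0, 0.0],
                            alpha=1.0, dx=1.0, dt=0.25, steps=1)
    print(profile)
```

In a distributed version, each process would typically own a slice of the grid and exchange boundary cells with its neighbors at every step, which is why such solvers are sensitive to network latency on cloud clusters.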
“… A preliminary shorter version of this work was published in Anais do XXIII Simpósio em Sistemas Computacionais de Alto Desempenho (WSCAD 2022) 5 …”
mentioning
confidence: 99%