Although the industry has embraced the cloud computing model, significant challenges remain concerning the quality of cloud services. Network-intensive applications may not scale in the cloud because the network infrastructure is shared. Performance evaluation studies in the literature show that the network tends to limit the scalability and performance of HPC applications. Therefore, we propose the aggregation of Network Interface Cards (NICs) in a ready-to-use integration with the OpenNebula cloud manager using Linux containers. We perform a set of experiments using a network microbenchmark to obtain specific network performance metrics, and the NAS Parallel Benchmarks to analyze the performance impact on HPC applications. Our results highlight that NIC aggregation improves network performance in terms of throughput and latency. Moreover, HPC applications behave differently under our approach, depending on their communication patterns and the amount of data transferred. While network-intensive applications improved their performance by up to 38%, other applications with aggregated NICs maintained the same performance or performed slightly worse.
The performance of HPC applications depends on two main components: processing power and the network interconnect. This article evaluates the impact of the network interconnect on parallel programs using a homogeneous cluster, in terms of performance and estimated execution cost.
Historically, large computational clusters have supported the hardware requirements for executing High-Performance Computing (HPC) applications. This model has become outdated due to the high costs of maintaining and updating these infrastructures. Currently, under the cloud computing paradigm, computing resources are delivered as a service, and we have witnessed consistent efforts to migrate HPC applications to the cloud. However, while cloud computing offers an attractive environment for HPC, benefiting from the pay-per-use model and on-demand resource allocation, there are still significant performance challenges to be addressed, such as the known network bottleneck. In this article, we evaluate a Network Interface Card (NIC) aggregation approach, using the IEEE 802.3ad standard, to improve the performance of representative HPC applications executed in an LXD container-based cloud. We assessed the impact of aggregation using two and four NICs with three distinct transmission hash policies. Our results demonstrate that, if the correct hash policy is selected, NIC aggregation can improve the performance of network-intensive HPC applications by up to 40%.
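To illustrate why the choice of transmission hash policy can matter, here is a minimal Python sketch. It is not the authors' implementation, and the hashes are deliberately simplified XOR formulas rather than the exact ones used by any real bonding driver. The abstract does not name the three policies evaluated; contrasting a MAC-based policy with an IP-and-port-based one is an assumption, and the function names, addresses, and port numbers are all hypothetical.

```python
# Simplified sketch: how a transmit hash policy maps flows onto aggregated NICs.
# These XOR-based hashes are illustrative assumptions, not the exact formulas
# used by any real bonding driver.

def layer2_hash(src_mac: int, dst_mac: int, n_nics: int) -> int:
    """Pick a NIC from MAC addresses only (same host pair -> same NIC)."""
    return (src_mac ^ dst_mac) % n_nics


def layer34_hash(src_ip: int, dst_ip: int, src_port: int, dst_port: int,
                 n_nics: int) -> int:
    """Pick a NIC from IP addresses and ports (flows can spread out)."""
    return (src_ip ^ dst_ip ^ src_port ^ dst_port) % n_nics


# Hypothetical pair of hosts exchanging eight parallel TCP flows.
N_NICS = 4
SRC_MAC, DST_MAC = 0x020000000001, 0x020000000002
SRC_IP, DST_IP = 0x0A000001, 0x0A000002  # 10.0.0.1 and 10.0.0.2
DST_PORT = 5000
SRC_PORTS = range(40000, 40008)

# The MAC-based choice ignores ports, so every flow gets the same answer.
layer2_nics = [layer2_hash(SRC_MAC, DST_MAC, N_NICS) for _ in SRC_PORTS]
# The IP/port-based choice varies with the source port of each flow.
layer34_nics = [layer34_hash(SRC_IP, DST_IP, p, DST_PORT, N_NICS)
                for p in SRC_PORTS]

print("layer2 NICs used:  ", sorted(set(layer2_nics)))
print("layer3+4 NICs used:", sorted(set(layer34_nics)))
```

In this toy setup, the MAC-only policy sends all eight flows between the two hosts over a single NIC, while the IP-and-port policy spreads them across all four. That is one plausible reason an unsuitable policy could leave aggregated links idle.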
The performance of parallel applications depends on two main components of the environment: processing power and the network interconnect. In this work, we evaluated the impact of a high-performance interconnect on parallel programs in a homogeneous cluster of servers interconnected by 1 Gbps Gigabit Ethernet and 56 Gbps InfiniBand FDR. We characterized the NAS Parallel Benchmarks with respect to computation, communication, and execution cost on Microsoft Azure instances. The results showed that, for highly network-dependent applications, performance can be significantly improved by using InfiniBand, at a lower execution cost, even with the higher instance price.
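The cost argument can be made concrete with a small Python sketch. It assumes that execution cost is simply the hourly instance price multiplied by the runtime and the number of instances. The function name `execution_cost` and all prices and runtimes are hypothetical placeholders, not figures from the study.

```python
# Execution cost = hourly instance price x runtime in hours x number of instances.
# All prices and runtimes below are hypothetical placeholders.

def execution_cost(price_per_hour: float, runtime_hours: float,
                   instances: int = 1) -> float:
    """Return the total cost of one run on pay-per-use instances."""
    return price_per_hour * runtime_hours * instances


# Hypothetical cheaper instance with a slower interconnect.
ethernet_cost = execution_cost(price_per_hour=1.00, runtime_hours=10.0,
                               instances=4)
# Hypothetical pricier instance whose faster interconnect halves the runtime.
infiniband_cost = execution_cost(price_per_hour=1.50, runtime_hours=5.0,
                                 instances=4)

print(f"Ethernet run:   ${ethernet_cost:.2f}")
print(f"InfiniBand run: ${infiniband_cost:.2f}")
```

With these made-up numbers, the instance that costs 50% more per hour still yields the cheaper run, because the runtime reduction outweighs the price difference.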