High-Performance Computing (HPC) platforms are growing in size and complexity. To improve the quality of service of such platforms, researchers devote considerable effort to devising algorithms and techniques that improve different aspects of performance, such as energy consumption, total platform usage, and fairness between users. Despite this, system administrators remain reluctant to deploy state-of-the-art scheduling methods, and most of them revert to EASY-backfilling, also known as EASY-FCFS (EASY-First-Come-First-Served). Newer methods are frequently complex and obscure, and the simplicity and transparency of EASY are too important to sacrifice. In this work, we used execution logs from five HPC platforms to compare four simple scheduling policies: FCFS, Shortest estimated Processing time First (SPF), Smallest Requested Resources First (SQF), and Smallest estimated Area First (SAF). Using simulations, we performed a thorough analysis of the cumulative results for up to 180 weeks and considered three scheduling objectives: waiting time, slowdown, and per-processor slowdown. We also evaluated other effects, such as the relationship between job size and slowdown, the distribution of slowdown values, and the number of backfilled jobs, for each HPC platform and scheduling policy. We conclude that one can only gain by replacing EASY-backfilling with SAF with backfilling, as it offers performance improvements of up to 80% in the slowdown metric while maintaining the simplicity and transparency of FCFS. Moreover, SAF reduces the number of jobs with large slowdowns, and the inclusion of a simple thresholding mechanism guarantees that no starvation occurs. Finally, we propose SAF as a new benchmark for future scheduling studies.
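As an illustration of how lightweight such a policy is, the sketch below orders a waiting queue under SAF, taking a job's "area" to be its user-estimated runtime multiplied by its requested processors. The struct, field names, and values are hypothetical illustrations, not code from any actual scheduler.

```c
/* Minimal sketch of SAF (Smallest estimated Area First) queue ordering,
 * assuming area = user-estimated runtime * requested processors.
 * All names and values here are illustrative assumptions. */
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    const char *name;
    double est_runtime;   /* user-provided runtime estimate (seconds) */
    int    procs;         /* requested processors */
} job_t;

static double area(const job_t *j) {
    return j->est_runtime * (double)j->procs;
}

/* qsort comparator: smaller estimated area first */
static int saf_cmp(const void *a, const void *b) {
    double da = area((const job_t *)a), db = area((const job_t *)b);
    return (da > db) - (da < db);
}

int main(void) {
    job_t queue[] = {
        {"A", 3600.0, 64},   /* area 230400 */
        {"B",  600.0, 16},   /* area   9600 */
        {"C", 7200.0,  4},   /* area  28800 */
    };
    qsort(queue, 3, sizeof queue[0], saf_cmp);
    for (int i = 0; i < 3; i++)
        printf("%s (area %.0f)\n", queue[i].name, area(&queue[i]));
    return 0;   /* prints B, C, A */
}
```

The other policies differ only in the comparator: SPF compares est_runtime alone and SQF compares procs alone, which is why all of them keep FCFS-like simplicity.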
Memory prefetcher algorithms are widely used in processors to mitigate the performance gap between processors and the memory subsystem. However, the complexity of these architectures and prefetcher algorithms hinders both the development of accurate architecture simulators and the understanding of the prefetcher's contribution to performance, on real hardware and in simulated environments alike. In this paper, we shed light on the memory prefetcher's role in the performance of parallel High-Performance Computing applications, considering the prefetcher algorithms offered by both the real hardware and the simulators. We performed a careful experimental investigation, executing the NAS Parallel Benchmarks (NPB) on a real Skylake machine and in a simulated environment with the ZSim and Sniper simulators, taking into account the prefetcher algorithms offered by both Skylake and the simulators. Our experimental results show that: (i) prefetching from the L3 into the L2 cache yields the best performance gains, (ii) memory contention in parallel execution constrains the prefetcher's effect, (iii) Skylake's parallel memory contention is poorly simulated by ZSim and Sniper, and (iv) Skylake's non-inclusive L3 cache hinders the accurate simulation of NPB with Sniper's prefetchers.
We present a hybrid exact algorithm for the Minimal Hitting Set (MHS) enumeration problem on highly heterogeneous CPU-GPU-MIC platforms. With several techniques that enable efficient exploitation of each architecture, low communication cost, and effective load balancing, we were able to enumerate MHSs for large instances in reasonable time, achieving good performance and scalability. We obtained speedups of up to 25.32 compared with using two six-core CPUs, and we enumerated MHSs for instances with tens of thousands of variables in less than 5 hours. We also evaluated our algorithm on a dataset derived from real-world data and, using a large CPU-GPU cluster, enumerated in parallel, for the first time, large minimal hitting sets of this dataset in less than 8 hours. These results reinforce the claim that heterogeneous clusters of CPUs, GPUs, and MICs can be used efficiently for high-performance computing.
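To make the underlying problem concrete, here is a minimal sketch of the check at the heart of MHS enumeration: a candidate is a minimal hitting set if it intersects every input set and loses that property when any single element is removed. The bitmask encoding (at most 64 elements) and function names are our own illustrative assumptions; the paper's algorithm targets far larger instances.

```c
/* Illustrative minimality check for hitting sets, assuming sets are
 * encoded as 64-bit masks. Not the paper's data layout or algorithm. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

static bool hits_all(uint64_t h, const uint64_t *sets, int n) {
    for (int i = 0; i < n; i++)
        if ((h & sets[i]) == 0) return false;  /* some set is not hit */
    return true;
}

static bool is_minimal_hitting_set(uint64_t h, const uint64_t *sets, int n) {
    if (!hits_all(h, sets, n)) return false;
    for (uint64_t m = h; m; m &= m - 1) {      /* iterate over bits of h */
        uint64_t bit = m & -m;                 /* lowest remaining bit */
        if (hits_all(h & ~bit, sets, n))       /* drop one element */
            return false;                      /* still hits: not minimal */
    }
    return true;
}

int main(void) {
    /* Sets {0,1}, {1,2}, {0,2} as bitmasks */
    uint64_t sets[] = {0x3, 0x6, 0x5};
    printf("%d\n", is_minimal_hitting_set(0x3, sets, 3));  /* {0,1}: 1 */
    printf("%d\n", is_minimal_hitting_set(0x7, sets, 3));  /* {0,1,2}: 0 */
    return 0;
}
```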
Vector accelerators, by design, favor executing the same set of instructions over multiple data elements, increasing the performance of real applications such as weather forecasting and oil prospecting. In this work, we evaluate the performance of parallel applications executed on the NEC SX-Aurora architecture. As case studies, we used the NAS benchmark and a real seismic migration application used by the oil and gas industry. Experimental results showed that the loop unrolling and inlining optimization techniques improved the performance of the NAS benchmark by up to 7.8x and of the real seismic migration application by up to 1.9x, compared with the performance of the original versions.
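For readers unfamiliar with the transformation, the sketch below shows the kind of loop unrolling evaluated here: the same reduction written naively and unrolled by a factor of 4, which breaks the serial dependence chain and exposes more independent operations to the compiler's vectorizer. The toy kernel is our own illustration, not code from the NAS benchmark or the seismic migration application, and actual gains on a vector machine such as the SX-Aurora depend on the compiler and kernel.

```c
/* Loop unrolling sketch: a plain sum and the same sum unrolled by 4. */
#include <stddef.h>

/* straightforward version: one serial dependence chain on s */
double sum_naive(const double *a, size_t n) {
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* unrolled by 4: four partial sums accumulate independently */
double sum_unrolled(const double *a, size_t n) {
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    for (; i < n; i++)   /* remainder loop for n not divisible by 4 */
        s0 += a[i];
    return (s0 + s1) + (s2 + s3);
}
```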