Proceedings of the 59th ACM/IEEE Design Automation Conference 2022
DOI: 10.1145/3489517.3530581
Using machine learning to optimize graph execution on NUMA machines

Cited by 6 publications (4 citation statements) · References 14 publications
“…Traditionally, HPC servers receive batches of graph applications to execute serially (i.e., one after another), always using all the available processing resources, in the same way as any other HPC application. However, because of their irregular structure and larger dimension, graph applications are intrinsically communication/memory-bound, which means they tend to use the CPU less than the average parallel application 1,10. Consequently, executions of graph applications are even more affected by the issues that limit the scalability of parallel applications, related to both hardware (e.g., saturation of execution units and the communication bus) and software (e.g., data synchronization and concurrent accesses to shared memory) 11-13.…”
Section: Introduction
confidence: 99%
“…The rising number of cores and available memory in high-performance computing (HPC) servers has enabled the analysis of massive amounts of graph-structured data extracted from sources like Google, Facebook, and Twitter 1-3. Simultaneously, the unprecedented growth of such interconnected data has pushed forward the development of efficient graph analytics methods that extract useful information from massive data sources, bringing advances in several areas such as business, geolocation, fraud detection, and social network analysis 4,5. Graph algorithms such as PageRank and single-source shortest paths (SSSP) make it possible to perform operations like ranking the most visited content on Google or even complex computations in the artificial intelligence domain.…”
Section: Introduction
confidence: 99%
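To make the kind of irregular, memory-bound kernel named in the statement above concrete, here is a minimal PageRank sketch (power iteration over an adjacency list). It is an illustration only, not code from the cited paper or the citing work; the example graph, damping factor, and iteration count are assumptions.

```python
# Hypothetical illustration: power-iteration PageRank over an adjacency list.
# The scattered writes into new_rank are the irregular memory accesses that make
# graph kernels like this communication/memory-bound on NUMA machines.
def pagerank(adj, damping=0.85, iters=50):
    """adj: dict mapping every node to a list of its successor nodes."""
    nodes = list(adj)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            out = adj[v]
            if not out:
                continue
            share = damping * rank[v] / len(out)
            for u in out:                 # scattered, data-dependent writes
                new_rank[u] += share
        rank = new_rank
    return rank

# Tiny example: a 3-node cycle converges to equal ranks.
print(pagerank({"a": ["b"], "b": ["c"], "c": ["a"]}))
```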
“…These approaches, however, provide little flexibility: they require knowing the running applications before execution starts and do not allow the number of spawned threads to be adapted dynamically. PredG [67] uses machine learning to select the best thread and data mapping policies for running graph applications on a NUMA system. Both approaches require application-level information, such as the input graphs, for decision-making.…”
Section: Cores
confidence: 99%
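As an illustration of the general idea described in this statement, and not PredG's actual implementation, the sketch below assumes a small set of graph features (vertex count, edge count, average degree) and hypothetical NUMA policy names, and trains an off-the-shelf decision-tree classifier to pick a thread/data mapping policy for an unseen input graph.

```python
# Hedged sketch (assumed features and policy names, not PredG's code): learn a
# mapping from simple graph characteristics to the NUMA thread/data mapping
# policy that performed best in prior profiling runs.
from sklearn.tree import DecisionTreeClassifier

# Hypothetical training data: [num_vertices, num_edges, avg_degree] per graph.
features = [
    [1_000_000,  5_000_000,  5.0],
    [10_000_000, 20_000_000, 2.0],
    [500_000,    50_000_000, 100.0],
]
# Label = policy that won for that graph in earlier measurements (assumed names).
best_policy = ["interleave", "first-touch", "bind-compact"]

model = DecisionTreeClassifier().fit(features, best_policy)

def choose_mapping(num_vertices, num_edges):
    """Predict a thread/data mapping policy for an unseen input graph."""
    avg_degree = num_edges / num_vertices
    return model.predict([[num_vertices, num_edges, avg_degree]])[0]

print(choose_mapping(2_000_000, 8_000_000))
```

The design point being illustrated is that the decision is driven by application-level information (the input graph's characteristics), which is exactly the requirement the quoted passage attributes to these approaches.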