“…Zheng et al (2013b) also propose several heuristics to compute process-to-core mappings and optimize the use of helper cores and staging nodes. Malakar et al (2015, 2016) consider in situ analysis as a numerical optimization problem to compute an optimal frequency of analytics subject to resource constraints such as I/O bandwidth and available memory. However, their work is limited to sequential simulation and analysis.…”
As computing moves toward exascale, input/output (I/O) management becomes increasingly critical for maintaining system performance. While the computing capacity of machines keeps growing, the I/O capabilities of systems do not increase as quickly. We are able to generate more data but unable to manage them efficiently due to the variability of I/O performance. Limiting the requests made to the parallel file system (PFS) becomes necessary. To address this issue, new strategies such as online in situ analysis are being developed. The idea is to overcome the limitations of basic postmortem data analysis, where the data must first be stored on the PFS and processed later. Several software solutions allow users to dedicate nodes specifically to data analysis and to distribute the computation tasks over different sets of nodes. Thus far, they rely on the user manually partitioning the resources and allocating the tasks (simulation, analysis) to them. In this work, we propose a memory-constrained model for in situ analysis. We use this model to design scheduling policies that determine both the number of resources to dedicate to analysis functions and an efficient schedule for these functions. We evaluate these policies and show the importance of accounting for memory constraints in the model. Finally, we discuss the challenges that must be addressed to build automatic tools for in situ analytics.
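To make the resource-partitioning problem concrete, here is a minimal sketch of one possible greedy policy: analysis functions are packed onto dedicated analysis nodes under a per-node memory budget, longest first, onto the least-loaded node that still fits them. This is an illustration only; the function names, data layout, and the policy itself are assumptions introduced here, not the scheduling policies evaluated in the paper.

```python
# Hypothetical sketch: greedy assignment of analysis functions to dedicated
# nodes under a per-node memory budget. Functions are placed in decreasing
# order of compute cost on the least-loaded node with enough free memory;
# a new node is opened when none fits. Not the paper's actual policies.

def schedule_analyses(analyses, node_memory, max_nodes):
    """analyses: list of (name, compute_time, memory) tuples.
    Returns a list of node dicts (assigned tasks, free memory, load),
    or None if the instance is infeasible under the budgets."""
    nodes = []
    # Longest-processing-time-first ordering tends to balance load.
    for name, time, mem in sorted(analyses, key=lambda a: -a[1]):
        fitting = [n for n in nodes if n["free_mem"] >= mem]
        if fitting:
            target = min(fitting, key=lambda n: n["load"])
        elif len(nodes) < max_nodes:
            target = {"free_mem": node_memory, "load": 0.0, "tasks": []}
            nodes.append(target)
        else:
            return None  # infeasible under the memory/node budget
        target["free_mem"] -= mem
        target["load"] += time
        target["tasks"].append(name)
    return nodes

if __name__ == "__main__":
    # Toy instance with made-up analysis names and costs.
    analyses = [("histogram", 2.0, 4), ("isosurface", 5.0, 12), ("stats", 1.0, 2)]
    print(schedule_analyses(analyses, node_memory=16, max_nodes=2))
```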
“…As stated in Section 1, the goal is to maximize a weighted throughput, since analysis applications may be required at different rates, from every simulation step to every tenth (or more) step [13]. We let β_i denote the weight of application A_i for 1 ≤ i ≤ m. Intuitively, β_i represents the number of times that we should execute application A_i at each iteration step.…”
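From the quoted definitions, one plausible formalization of the objective reads as follows. It is illustrative only: the symbol ρ_i (the execution rate of A_i achieved by a schedule) is introduced here, and the exact objective is given in the cited paper.

```latex
% Maximize the minimum weighted throughput over the m analysis applications,
% where \rho_i is the achieved execution rate of A_i and \beta_i its weight.
\max_{\text{schedule}} \; \min_{1 \le i \le m} \frac{\rho_i}{\beta_i}
```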
Section: Optimization Problem
“…In the simplest case, each application will have to complete within the time of a simulation step, hence we need to achieve the same throughput for each application, and maximize that value. In other situations, some applications may be needed only every k simulation steps, with a different value of k per application [13]. This calls for achieving a weighted throughput per application, and for maximizing the minimum value of these weighted throughputs, which dictates the global rate at which the analysis can progress.…”
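As an illustration of what maximizing the minimum weighted throughput can look like computationally, the sketch below brute-forces core allocations over m applications. The callback `time_fn`, the `beta` weights, and the toy Amdahl-style cost model are all assumptions introduced here, not the algorithm of the cited work.

```python
# Hypothetical sketch: exhaustive search over core allocations maximizing the
# minimum weighted throughput. time_fn(i, p) is an assumed callback returning
# the per-execution time of application i on p cores; beta[i] is its weight.

from itertools import product

def best_allocation(m, total_cores, beta, time_fn):
    best, best_alloc = 0.0, None
    # Enumerate all allocations giving each application at least one core.
    for alloc in product(range(1, total_cores + 1), repeat=m):
        if sum(alloc) != total_cores:
            continue
        # Weighted throughput of app i: execution rate divided by beta_i.
        score = min(1.0 / (time_fn(i, alloc[i]) * beta[i]) for i in range(m))
        if score > best:
            best, best_alloc = score, alloc
    return best_alloc, best

if __name__ == "__main__":
    # Toy Amdahl-like model (10% sequential fraction), illustrative only.
    t = lambda i, p: (0.1 + 0.9 / p) * (i + 1)
    print(best_allocation(m=3, total_cores=8, beta=[1, 2, 4], time_fn=t))
```

An exhaustive search like this is only tractable for small m and core counts; it is meant to pin down the objective, not to suggest a practical solver.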
With the recent advent of many-core architectures such as chip multiprocessors (CMP), the number of processing units accessing a global shared memory is constantly increasing. Co-scheduling techniques are used to improve application throughput on such architectures, but sharing resources often generates critical interferences. In this paper, we focus on the interferences in the last-level cache (LLC) and use the Cache Allocation Technology (CAT) recently provided by Intel to partition the LLC and give each co-scheduled application its own cache area. We consider m iterative HPC applications running concurrently and answer the following questions: (i) how to precisely model the behavior of these applications on the cache-partitioned platform? and (ii) how many cores and what fraction of the cache should be assigned to each application to maximize platform efficiency? Here, platform efficiency is defined either as maximizing the global performance or as guaranteeing a fixed ratio of iterations per second for each application. Through extensive experiments using CAT, we demonstrate the impact of cache partitioning when multiple HPC applications are co-scheduled onto CMP platforms.
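For readers unfamiliar with CAT, the sketch below shows one common way to apply it on Linux, through the resctrl filesystem: each co-scheduled application is placed in its own group with a contiguous mask of L3 ways. This is a generic illustration (group names, way masks, and PIDs are hypothetical, and way counts are platform dependent), not the experimental setup of the paper.

```python
# Illustrative sketch: partitioning the shared L3 with Intel CAT through the
# Linux "resctrl" interface. Requires root and a mounted resctrl filesystem
# (mount -t resctrl resctrl /sys/fs/resctrl). CAT way masks must be contiguous.

import os

RESCTRL = "/sys/fs/resctrl"

def create_partition(name, way_mask, pids, cache_id=0):
    """Create a resctrl group reserving the L3 ways in way_mask
    (a contiguous bitmask, e.g. 0x00f) for the given process IDs."""
    group = os.path.join(RESCTRL, name)
    os.makedirs(group, exist_ok=True)
    # The schemata file takes lines such as "L3:0=f" per cache domain.
    with open(os.path.join(group, "schemata"), "w") as f:
        f.write(f"L3:{cache_id}={way_mask:x}\n")
    # The tasks file accepts one PID per write.
    for pid in pids:
        with open(os.path.join(group, "tasks"), "w") as f:
            f.write(str(pid))

# Example with hypothetical PIDs on a 20-way LLC: app A gets the lower
# 4 ways, app B the upper 16.
# create_partition("appA", 0x0000f, [1234])
# create_partition("appB", 0xffff0, [5678])
```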
“…Depending on the simulation's resource requirements and the available analysis resources, there exists a tradeoff between in situ and in transit analysis. A significant future challenge is to better orchestrate and schedule in situ analyses together with the simulation while taking into account the time and memory requirements of the analyses, the importance of the analyses, and system parameters such as the computation time, I/O bandwidth, and maximum available memory, in order to decide the optimal frequencies of the in situ analyses [MVM*15].…”
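The frequency-selection problem alluded to above can be sketched as a small optimization program. The rendering below is illustrative, not the exact formulation of [MVM*15]: the symbols w_a (importance of analysis a), t_a (compute time), d_a (data volume), m_a (memory footprint), and the budgets T_step, B, M are all introduced here.

```latex
% Illustrative sketch: choose a frequency f_a (analyses per simulation step)
% for each analysis a so as to maximize total importance, subject to per-step
% compute-time, I/O-bandwidth, and memory budgets.
\begin{aligned}
\max_{f}\quad & \sum_{a} w_a f_a\\
\text{s.t.}\quad & \sum_{a} f_a\, t_a \le T_{\mathrm{step}},\qquad
  \sum_{a} f_a\, d_a \le B,\qquad
  \sum_{a} m_a \le M,\\
 & 0 \le f_a \le 1 \quad \text{for all } a.
\end{aligned}
```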
The considerable interest in the high-performance computing (HPC) community in analyzing and visualizing data without first writing it to disk, i.e., in situ processing, is due to several factors. First is the I/O cost savings: data are analyzed and visualized while being generated, without first being stored to a filesystem. Second is the potential for increased accuracy, where fine temporal sampling of transient analysis might expose complex behavior missed by coarse temporal sampling. Third is the ability to use all available resources, CPUs and accelerators, in the computation of analysis products. This STAR paper brings together researchers, developers, and practitioners using in situ methods in extreme-scale HPC, with the goal of presenting existing methods, infrastructures, and a range of computational science and engineering applications using in situ analysis and visualization.