The convergence between computing- and data-centric workloads and platforms imposes new challenges on how to best use the resources of modern computing systems. In this paper, we investigate alternatives for the storage subsystem of a novel exascale-capable system, with special emphasis on how allocation strategies affect overall performance. We consider several aspects of data-aware allocation, such as the effect of spatial and temporal locality, the affinity of data to storage sources, and network-level traffic prioritization for different types of flows. In our experimental set-up, temporal locality can have a substantial effect on application runtime (up to a 10% reduction), whereas spatial locality can be even more significant (up to one order of magnitude faster with perfect locality). The use of structured access patterns to the data and the allocation of bandwidth at the network level can also have a significant impact (up to 20% and 17% reductions in runtime, respectively). These results suggest that scheduling policies exposing data-locality information can be essential for the appropriate utilization of future large-scale systems. Finally, we found that the distributed storage system we are implementing can outperform traditional SAN architectures, even with a much smaller back-end (in terms of I/O servers).
KEYWORDS
inter-processor communications, near-data computing, resource allocation, scheduling, storage traffic
INTRODUCTION
Traditionally, supercomputers have been used to execute large, compute-intensive parallel applications such as scientific codes. Nowadays, however, new types of data-oriented applications are becoming increasingly popular. In contrast with traditional high-performance computing (HPC) codes, they have to process massive amounts of scientific or business-oriented data and, hence, impose completely different requirements on the computing systems. Indeed, new hardware and software are being developed to meet these needs. One of these systems is our novel, custom-made architecture, ExaNeSt [1]. We are working on the design and construction of a prototype capable of reaching exascale computation using tens of millions of interconnected low-power-consumption ARM cores [2]. To support such data-intensive applications, we leverage a unified, low-latency interconnection network (hereafter, IN) and a fully distributed storage subsystem, BeeGFS, with the data spread across the nodes in local non-volatile memory (NVM) storage devices [3]. This greatly contrasts with traditional supercomputers and datacenters, which rely on Storage Area Networks (SANs) to access the data, with separate networks for I/O, system management, and inter-processor communications (IPC). A fully distributed file system allows for near-data computation, reducing the great overheads of moving data from centralized storage to the compute nodes [4]. A single, consolidated IN offers enormous power savings when compared with multi-network designs. While a unified IN does, indeed, allow us to cope with power and co...
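To make the near-data idea concrete, the following is a minimal, hypothetical sketch of data-affinity placement: given a map of which node stores each block of a task's input, schedule the task on the node that already holds the most blocks locally. All names here (pick_node, block_locations, free_nodes) are our own illustrations, not part of the ExaNeSt or BeeGFS interfaces.

```python
# Hypothetical sketch of data-affinity placement on a distributed file
# system: prefer the free node that locally stores most of a task's input.
from collections import Counter

def pick_node(task_blocks, block_locations, free_nodes):
    """Return the free node holding the most of the task's input blocks."""
    # Count, per free node, how many of the task's blocks it stores locally.
    tally = Counter(block_locations.get(b) for b in task_blocks
                    if block_locations.get(b) in free_nodes)
    if tally:
        return tally.most_common(1)[0][0]
    return next(iter(free_nodes))  # no local data anywhere: any free node

# Example: blocks b0 and b1 live on node n1; b2 on node n2.
locations = {"b0": "n1", "b1": "n1", "b2": "n2"}
print(pick_node(["b0", "b1", "b2"], locations, {"n1", "n2"}))  # prints n1
```

A real scheduler would also weigh load, replica placement, and network distance, but even this greedy rule captures why exposing data-locality information to the scheduler matters: the chosen node reads two of three blocks from local NVM instead of over the IN.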