A Batch Task Migration Approach for Decentralized Global Rescheduling

2019 International Conference on High Performance Computing & Simulation (HPCS)

Santana

Castro

et al. 2019

Self Cite

In this paper, we propose a Distributed Graph Model (DGM) and data structure to enable communication-aware heuristics in distributed load balancers (LBs). DGM is motivated by the desire to maintain and use information related to the affinity between tasks (their communication) in order to improve data locality while scheduling tasks in a distributed fashion to avoid the centralization overhead. Results show that DGM is able to achieve speedups of up to 50.4x with 40 virtual cores when compared to a centralized graph representation with the same purpose. Additionally, we propose a proof-of-concept distributed scheduler that uses DGM, named Edge Migration, and its implementation in the Charm++ parallel programming model. These results show that, although the communication analysis is much faster with DGM, it is still the most relevant overhead in distributed LBs. We also observe that Edge Migration has a decision time in the same order of magnitude as other communication-unaware decentralized algorithms. Thus, DGM can be used in communication-aware distributed LBs to improve load balancing decisions with a small impact on the overall LB performance.

Section: Results (mentioning)

confidence: 59%

Section: Efforts In Load Balancing (mentioning)

confidence: 99%

Distributed Memory Graph Representation for Load Balancing Data: Accelerating Data Structure Generation for Decentralized Scheduling

2019 International Conference on High Performance Computing & Simulation (HPCS)

Santana

Castro

et al. 2019

Self Cite

“…We may divide LB algorithms in two main categories: Global and Diffusive [10]. While global algorithms centralize system information to make scheduling decisions (most list scheduling approaches follow this methodology [11], [12]); diffusive algorithms will use local machine information to dissipate load among its neighbors or other machines in the system [13], [14], [15].…”

Section: Background a Load Balancing (mentioning)

confidence: 99%
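The distinction drawn in the quoted statement can be illustrated with a minimal Python sketch. It is not taken from any of the cited papers: `global_greedy` stands in for a centralized list-scheduling heuristic (heaviest task first, onto the least-loaded machine), and `diffusion_step` stands in for a diffusive scheme in which each machine only looks at its neighbors. The function names, the `alpha` exchange factor, and the ring topology in the usage note are all assumptions made for illustration.

```python
import heapq


def global_greedy(task_loads, n_machines):
    """Centralized sketch: one scheduler sees every task load and
    assigns the heaviest remaining task to the least-loaded machine."""
    heap = [(0.0, m) for m in range(n_machines)]
    heapq.heapify(heap)
    assignment = {}
    for task, load in sorted(enumerate(task_loads), key=lambda t: -t[1]):
        total, machine = heapq.heappop(heap)
        assignment[task] = machine
        heapq.heappush(heap, (total + load, machine))
    return assignment


def diffusion_step(machine_loads, neighbors, alpha=0.25):
    """Diffusive sketch: each machine uses only its neighbors' loads and
    exchanges a fraction alpha of every pairwise load difference.
    Assumes a symmetric neighbor relation, so total load is conserved."""
    new_loads = list(machine_loads)
    for machine, nbrs in neighbors.items():
        for nbr in nbrs:
            new_loads[machine] += alpha * (machine_loads[nbr] - machine_loads[machine])
    return new_loads
```

As a usage example, on a hypothetical ring of four machines (`{0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}`) with all load initially on machine 0, repeated calls to `diffusion_step` spread the load outward one hop at a time, whereas `global_greedy` produces a balanced assignment in a single pass at the cost of gathering all task loads in one place.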

Adaptive Load Balancing based on Machine Learning for Iterative Parallel Applications

Oikawa

2020 28th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)

Castro

et al. 2020

Self Cite

The performance of irregular scientific applications can be easily affected by an uneven distribution of work among the computing resources. In this context, Load Balancing (LB) stands as one of the most important solutions to improve resource utilization. However, choosing the best-performing load balancing algorithm for a given application is not a trivial task. For instance, manually and statically choosing an LB algorithm does not work in situations where applications have a dynamic or unknown behavior. In this context, we propose a Machine Learning-based Adaptive Load Balancer (ADAPTIVELB) to automate the load balancing algorithm decision at run time. This approach monitors and collects information about the application dynamically, and according to the analyzed data, it makes a decision of invoking the most suitable LB algorithm. Our experiments show that ADAPTIVELB can select a good load balancing algorithm in most of the cases, leading to performance improvements over statically chosen LB algorithms and over the absence of a load balancer.

“…Our approach reduces scheduling overhead by applying the following design: (i) execute scheduling steps in a distributed fashion; (ii) perform workload discretization to simplify decisions; (iii) group local tasks to avoid numerous fine-grained migrations and (iv) minimize the messages between scheduling actors. The discretization technique, called packing, extends a previous technique [6] with notions related to ε-Nash Equilibrium. We also present PackStealLB, a new distributed load balancing algorithm that uses packing and inherits ideas related to constrained [7,8] and randomized [9] Work Stealing (WS) heuristics.…”

Section: Introduction (mentioning)

confidence: 99%

PackStealLB: A scalable distributed load balancer based on work stealing and workload discretization

Journal of Parallel and Distributed Computing

Pilla

Santana

et al. 2021

Self Cite

The scalability of high-performance, parallel iterative applications is directly affected by how well they use the available computing resources. These applications are subject to load imbalance due to the nature and dynamics of their computations. It is common that high performance systems employ periodic load balancing to tackle this issue. Dynamic load balancing algorithms redistribute the application's workload using heuristics to circumvent the NP-hard complexity of the problem. However, scheduling heuristics must be fast to avoid hindering application performance when distributing the workload on large and distributed environments. In this work, we present a technique for low overhead, high quality scheduling decisions for parallel iterative applications. The technique relies on combined application workload information paired with distributed scheduling algorithms. An initial distributed step among scheduling agents groups application tasks in packs of similar load to minimize messages among them. This information is used by our scheduling algorithm, PackStealLB, for its distributed-memory work stealing heuristic. Experimental results showed that PackStealLB is able to improve the performance of a molecular dynamics benchmark by up to 41%, outperforming other scheduling algorithms in most scenarios over almost one thousand cores.
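The idea of grouping tasks into packs of similar load can be sketched as follows. This is only an illustration under stated assumptions, not the PackStealLB algorithm itself: the function name `make_packs`, the fixed `pack_size` threshold, and the heaviest-first ordering are hypothetical choices, and the paper's actual packing rule is not reproduced here.

```python
def make_packs(task_loads, pack_size):
    """Group local tasks into packs whose load reaches about pack_size.

    task_loads: dict mapping a task id to its measured load.
    Returns a list of (pack_load, [task ids]) tuples. Migrating one pack
    then moves several fine-grained tasks with a single decision.
    """
    packs = []
    current, current_load = [], 0.0
    # Heaviest tasks first, so large tasks close a pack on their own.
    for task, load in sorted(task_loads.items(), key=lambda kv: -kv[1]):
        current.append(task)
        current_load += load
        if current_load >= pack_size:
            packs.append((current_load, current))
            current, current_load = [], 0.0
    if current:  # leftover tasks form a final, lighter pack
        packs.append((current_load, current))
    return packs
```

With packs in hand, an underloaded scheduling agent in this sketch would request a whole pack from a victim rather than negotiating task by task, which is the message-saving effect the abstract describes.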