2022
DOI: 10.1109/tpds.2021.3107775

LB4OMP: A Dynamic Load Balancing Library for Multithreaded Applications

Abstract: Exascale computing systems will exhibit high degrees of hierarchical parallelism, with thousands of computing nodes and hundreds of cores per node. Efficiently exploiting hierarchical parallelism is challenging due to load imbalance that arises at multiple levels. OpenMP is the most widely used standard for expressing and exploiting the ever-increasing node-level parallelism. The scheduling options in OpenMP are insufficient to address the load imbalance that arises during the execution of multithreaded applic…

Cited by 8 publications (13 citation statements)
References 38 publications (59 reference statements)
“…To further aid in moving and distributing VMs, GEP estimates the VMH load and uses this information to determine which VMs should be assigned to each of the available VMAs. An open-source dynamic load-balancing library is introduced in this study, and it incorporates effective literature-based scheduling techniques [5]. LB4OMP is a research framework that encourages and supports research programming for the benefit of multi-threaded applications.…”
Section: Related Work (mentioning)
confidence: 99%
“…Numerous loop scheduling algorithms were implemented in various OpenMP runtime libraries and made publicly available for OpenMP users. For instance, LB4OMP [1], an extended version of the LLVM OpenMP runtime library (RTL), supports various dynamic loop scheduling (DLS) algorithms, including fixed size chunking (FSC) [20], factoring (FAC) [21], the practical variant of factoring (FAC2), tapering (TAP) [22], the practical variant of weighted factoring (WF2) [23], BOLD [24], adaptive weighted factoring (AWF), its variants (AWF-B,C,D,E) [25], adaptive factoring (AF) [26], mFAC and mAF (versions of FAC and AF with less overhead). Such a variety of many scheduling algorithms may bring users to face decision paralysis [11].…”
Section: Related Work (mentioning)
confidence: 99%
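
To make one of the algorithms named in the statement above concrete, here is a minimal sketch of how FAC2 (the practical variant of factoring) is commonly described in the scheduling literature: each batch hands out half of the remaining iterations, split evenly across the threads. The function name `fac2_chunks` and its parameters are hypothetical and chosen for illustration; this is not LB4OMP's implementation.

```python
import math


def fac2_chunks(n_iters, n_threads):
    """Return the sequence of chunk sizes a FAC2-style scheduler would hand out.

    Assumption: each batch consists of n_threads chunks, each of size
    ceil(remaining / (2 * n_threads)), where `remaining` is measured at
    the start of the batch.
    """
    chunks = []
    remaining = n_iters
    while remaining > 0:
        # Chunk size for this batch: half the remaining work, split evenly.
        size = max(1, math.ceil(remaining / (2 * n_threads)))
        for _ in range(n_threads):
            if remaining == 0:
                break
            take = min(size, remaining)
            chunks.append(take)
            remaining -= take
    return chunks


if __name__ == "__main__":
    print(fac2_chunks(100, 4))
```

Chunk sizes shrink from batch to batch, so early chunks keep scheduling overhead low while the small late chunks even out differences in thread finishing times.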
“…It has been shown that these three scheduling algorithms in OpenMP are insufficient for efficient scheduling of OpenMP parallel loops and that other scheduling algorithms deliver higher performance gains [1], [7], [8], [9]. However, those performance gains are achievable only via a careful (and expert) selection of scheduling algorithm and chunk parameter, on a per-loop, per-time-step, per-application, and per-system basis.…”
Section: Introduction (mentioning)
confidence: 99%
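
The sensitivity to scheduling algorithm and chunk parameter described in the statement above can be illustrated with a toy model. The sketch below compares a static block distribution with chunked self-scheduling on a loop whose iteration cost grows linearly. The function names, the cost model, and the zero-overhead assumption are all hypothetical simplifications, not measurements or code from the cited papers.

```python
import heapq
import math


def static_makespan(costs, n_threads):
    """Finish time when iterations are split into contiguous, equal-sized blocks."""
    block = math.ceil(len(costs) / n_threads)
    return max(
        (sum(costs[i * block:(i + 1) * block]) for i in range(n_threads)),
        default=0,
    )


def dynamic_makespan(costs, n_threads, chunk):
    """Finish time when idle threads grab the next `chunk` iterations.

    Assumption: fetching a chunk costs nothing, which favors small chunks
    more than a real runtime would.
    """
    finish_times = [0] * n_threads  # min-heap of per-thread finish times
    heapq.heapify(finish_times)
    for start in range(0, len(costs), chunk):
        earliest = heapq.heappop(finish_times)  # thread that becomes idle first
        heapq.heappush(finish_times, earliest + sum(costs[start:start + chunk]))
    return max(finish_times)


if __name__ == "__main__":
    costs = list(range(1, 101))  # iteration i costs i time units
    print("static :", static_makespan(costs, 4))
    for chunk in (1, 5, 25):
        print(f"dynamic, chunk={chunk}:", dynamic_makespan(costs, 4, chunk))
```

In this model the best chunk size depends on the cost profile and thread count, which is the per-loop tuning burden the statement refers to; a real runtime also pays a per-chunk overhead that the model ignores.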