Abstract: Understanding the behavior of parallel applications that use the Message Passing Interface (MPI) is critical for optimizing communication performance. Performance tools for MPI currently rely on the PMPI Profiling Interface or the MPI Tools Information Interface, MPI_T, for portably collecting information for performance measurement and analysis. While tools using these interfaces have proven to be extremely valuable for performance tuning, these interfaces only provide synchronous information, i.e., when an M…
“…Our approach proposes a set of events triggered by MPI and captured by ATaP runtime systems. To be consistent with the MPI standard, we implement our techniques on top of existing solutions like MPI_T, the MPI Tool Information Interface introduced in MPI 3.0 [11], as well as the recently proposed MPI_T Events extensions [13]. The latter provides the necessary infrastructure for callbacks in MPI, intended for the support of tracing tools, but does not define any concrete events matching the philosophy of MPI_T. In particular, we propose adding the following events to MPI: • MPI_COLLECTIVE_PARTIAL_INCOMING signals the arrival of some data in the context of a collective communication.…”
Section: Extending MPI To Support Event Handling (mentioning)
confidence: 99%
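If events such as MPI_COLLECTIVE_PARTIAL_INCOMING were added to the MPI_T events namespace, an ATaP runtime could probe for them by name at startup and fall back to ordinary request polling when the MPI library does not expose them. The sketch below is a minimal illustration that assumes the MPI 4.x standardization of the Events interface (MPI_T_event_get_index); the cited proposal may differ in the exact lookup call, and a stock MPI library will normally not expose this proposed event at all.

#include <mpi.h>
#include <stdio.h>

/* Probe for one of the proposed events by name. The event name is the one
 * proposed in the quoted passage; if the MPI library does not expose it,
 * MPI_T_event_get_index returns an error and the runtime falls back to
 * plain request polling. */
int lookup_partial_incoming_event(int *event_index)
{
    int provided, err;

    /* The MPI_T interface is initialized separately from MPI itself. */
    err = MPI_T_init_thread(MPI_THREAD_SINGLE, &provided);
    if (err != MPI_SUCCESS) return err;

    err = MPI_T_event_get_index("MPI_COLLECTIVE_PARTIAL_INCOMING", event_index);
    if (err != MPI_SUCCESS)
        fprintf(stderr, "event not exposed by this MPI library\n");
    return err;
}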
“…In particular, by having the events described in Section 3.1 handled by callbacks, we release the ATaP runtime system from the need for polling the event queue. For this functionality, we directly rely on the MPI_T Events proposal [13], which provides generic callbacks mainly intended to implement tracing tools. We use it to track the events described in Section 3.1 and notify the ATaP runtime, which can then associate a handler function by invoking the MPI_T_event_handle_alloc call, as described by Hermanns et al. [13].…”
Section: Callback-based Notification (mentioning)
confidence: 99%
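In this scheme the runtime binds an event-registration handle to an MPI object (here a communicator) and attaches a handler function to it. The quoted passage attaches the handler through the handle-allocation call of the proposal; the sketch below instead follows the MPI 4.x split into MPI_T_event_handle_alloc plus MPI_T_event_register_callback, so details may differ from the authors' prototype. The runtime_state pointer is a hypothetical handle to ATaP bookkeeping.

#include <mpi.h>

/* Handler invoked by the MPI library when the event fires. An ATaP runtime
 * would typically mark the corresponding task dependency as (partially)
 * satisfied here instead of spinning on MPI_Test. */
static void on_partial_incoming(MPI_T_event_instance instance,
                                MPI_T_event_registration registration,
                                MPI_T_cb_safety cb_safety, void *user_data)
{
    /* user_data points at runtime-private state passed at registration time. */
    (void)instance; (void)registration; (void)cb_safety; (void)user_data;
}

/* Bind a previously looked-up event index to a communicator and attach the
 * handler. runtime_state is hypothetical ATaP bookkeeping. */
int register_partial_incoming(int event_index, MPI_Comm comm, void *runtime_state)
{
    MPI_T_event_registration reg;
    int err;

    err = MPI_T_event_handle_alloc(event_index, &comm, MPI_INFO_NULL, &reg);
    if (err != MPI_SUCCESS) return err;

    /* MPI_T_CB_REQUIRE_NONE: the handler makes no MPI calls and has no
     * special thread- or signal-safety requirements. */
    return MPI_T_event_register_callback(reg, MPI_T_CB_REQUIRE_NONE,
                                         MPI_INFO_NULL, runtime_state,
                                         on_partial_incoming);
}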
“…For this functionality, we directly rely on the MPI_T Events proposal [13], which provides generic callbacks mainly intended to implement tracing tools. We use it to track the events described in Section 3.1 and notify the ATaP runtime, which can then associate a handler function by invoking the MPI_T_event_handle_alloc call, as described by Hermanns et al. [13]. Once invoked, the runtime receives an MPI event object, which can be decoded with MPI_T_event_read.…”
Section: Callback-based Notification (mentioning)
confidence: 99%
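Inside the handler, the delivered event instance is decoded element by element with MPI_T_event_read; the number and types of the elements are those reported by MPI_T_event_get_info for that event. The layout used below (element 0 read as an int, e.g. a source rank) is invented for illustration only, since the proposal does not fix a payload for the new events; the handler extends the empty one from the previous sketch.

#include <mpi.h>
#include <stdio.h>

/* Decode the event instance delivered to the callback. The element layout
 * assumed here (element 0 = int) is hypothetical. */
static void on_partial_incoming(MPI_T_event_instance instance,
                                MPI_T_event_registration registration,
                                MPI_T_cb_safety cb_safety, void *user_data)
{
    int source_rank = -1;

    /* Read element 0 of the event instance into a local buffer. */
    if (MPI_T_event_read(instance, 0, &source_rank) == MPI_SUCCESS)
        printf("partial collective data arrived (element 0 = %d)\n", source_rank);

    (void)registration; (void)cb_safety; (void)user_data;
}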
“…We present two mechanisms, similar to the MPI tools interface (MPI_T) [13], for exchanging information between MPI and an ATaP runtime system and analyze their trade-offs: 1. a fast mechanism to poll events when idle using a lock-free queue, and 2. a delivery solution based on callbacks that can benefit from a hardware implementation, shown in the bottom row of Figure 1. These mechanisms allow ATaP runtimes to seamlessly interoperate with MPI by reducing or completely eliminating the need for explicit polling or waiting on specific requests, and instead deliberately invoking the progress engine only when needed, driven by runtime events.…”
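The paper does not reproduce its queue implementation, but the first mechanism can be pictured as a single-producer/single-consumer ring buffer: the MPI progress engine pushes small event records, and an idle ATaP worker drains them without taking a lock. The record layout and capacity below are assumptions made for illustration; only the acquire/release discipline between producer and consumer is the point.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical event record pushed by the MPI side; the record layout used
 * in the paper is not specified. */
typedef struct { int kind; int peer; uint64_t tag; } mpi_event_t;

#define QCAP 1024  /* power of two so indices wrap with a mask */

/* Single-producer (MPI progress engine) / single-consumer (idle ATaP worker)
 * ring buffer: no locks, only acquire/release ordering on head and tail. */
typedef struct {
    mpi_event_t buf[QCAP];
    _Atomic uint64_t head;  /* next slot to write, advanced by producer */
    _Atomic uint64_t tail;  /* next slot to read, advanced by consumer */
} event_queue_t;

static bool queue_push(event_queue_t *q, mpi_event_t ev)
{
    uint64_t head = atomic_load_explicit(&q->head, memory_order_relaxed);
    uint64_t tail = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (head - tail == QCAP) return false;        /* full: caller drops or retries */
    q->buf[head & (QCAP - 1)] = ev;
    atomic_store_explicit(&q->head, head + 1, memory_order_release);
    return true;
}

static bool queue_poll(event_queue_t *q, mpi_event_t *out)
{
    uint64_t tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    uint64_t head = atomic_load_explicit(&q->head, memory_order_acquire);
    if (tail == head) return false;               /* empty: nothing pending */
    *out = q->buf[tail & (QCAP - 1)];
    atomic_store_explicit(&q->tail, tail + 1, memory_order_release);
    return true;
}

An idle worker would call queue_poll in its scheduling loop and invoke the MPI progress engine only when the queue signals pending communication, matching the "invoke the progress engine only when needed" behavior described in the quoted passage.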
Asynchronous task-based programming models are gaining popularity to address the programmability and performance challenges in high performance computing. One of the main attractions of these models and runtimes is their potential to automatically expose and exploit overlap of computation with communication. However, we find that inefficient interactions between these programming models and the underlying messaging layer (in most cases, MPI) limit the achievable computation-communication overlap and negatively impact the performance of parallel programs. We address this challenge by exposing and exploiting information about MPI internals in a task-based runtime system to make better task-creation and scheduling decisions. In particular, we present two mechanisms for exchanging information between MPI and a task-based runtime, and analyze their trade-offs. Further, we present a detailed evaluation of the proposed mechanisms implemented in MPI and a task-based runtime. We show performance improvements of up to 16.3% and 34.5% for proxy applications with point-to-point and collective communication, respectively.