The Message Passing Interface (MPI) is a crucial programming tool for enabling communication between processes in parallel applications. The goal of MPI users is to allocate tasks to processors in a way that maximizes both spatial and temporal locality in the network. However, this can be challenging, especially in large-scale networks where maximizing processor locality may not be feasible at runtime. To address this issue, we propose Hamorder, an offline node reassignment approach that takes physical processor locations into account via graph reordering for Random network topologies. Hamorder aims to optimize task mapping for improved performance in parallel applications, whether across multiple tasks or within a single task. Additionally, we investigate the potential of improving MPI applications through runtime parameter tuning based on Hamorder. Our evaluation shows that Hamorder delivers a 27.3% performance improvement on Random topologies over Gorder, a state-of-the-art algorithm that enhances cache locality by rearranging a graph's vertices so that vertices typically accessed together are placed in close proximity. Moreover, our autotuning framework using Hamorder achieves an average speedup of 1.38x for targeted MPI applications by searching through various runtime parameter combinations.
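The abstract does not specify Hamorder's reordering algorithm, so the following is only a minimal sketch of the general idea behind locality-oriented graph reordering: relabeling vertices so that vertices connected in the communication graph receive nearby IDs. The function name `bfs_reorder` and the BFS strategy are illustrative assumptions, not the paper's method (Gorder and Hamorder use more sophisticated orderings).

```python
from collections import deque

def bfs_reorder(adjacency):
    """Relabel vertices in BFS order so that neighboring (i.e., frequently
    co-accessed) vertices receive nearby new IDs.

    `adjacency` maps each vertex to a list of its neighbors. Returns a dict
    mapping old vertex IDs to new, locality-aware IDs.
    """
    new_id = {}
    next_id = 0
    for start in adjacency:          # loop covers disconnected components
        if start in new_id:
            continue
        new_id[start] = next_id
        next_id += 1
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in adjacency[v]:
                if w not in new_id:  # first visit assigns the next free ID
                    new_id[w] = next_id
                    next_id += 1
                    queue.append(w)
    return new_id

# Tiny communication graph: 0-3, 1-2, 2-3 exchange messages.
graph = {0: [3], 1: [2], 2: [1, 3], 3: [0, 2]}
print(bfs_reorder(graph))  # {0: 0, 3: 1, 2: 2, 1: 3}
```

After relabeling, vertices 0 and 3 (and likewise 3 and 2) hold adjacent IDs, so a mapping that assigns consecutive IDs to nearby processors keeps communicating pairs physically close.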
The use of approximate communication has emerged as a promising approach for enhancing the efficiency of communication in parallel computer systems. By sending incomplete or imprecise messages, approximate communication can significantly reduce communication time. In this study, we examine application-level techniques for approximate communication that enable high portability on high-performance interconnection networks. Specifically, we focus on lossy compression of floating-point data, which is frequently exchanged between compute nodes in parallel applications. Our approach involves a simple application scenario in which a source process compresses a communication dataset and a destination process decompresses it in an MPI parallel program. We use two bitwise procedures for compression: lossy bitzip compression and lossless bit-mask compression. Our aim is to transmit the largest possible amount of approximate data with the least possible compression overhead. Additionally, we explore error-checking and correction techniques to ensure bit-flip fault tolerance for the compressed data during transmission. We implement our scheme in several communication-intensive MPI applications and demonstrate that our approximate communication approach effectively speeds up total execution time while staying within a specified quality-of-result error bound.
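The abstract's bitzip and bit-mask procedures are not detailed here, so the following is a minimal sketch of the underlying idea of lossy bitwise compression of floats with a bounded relative error: zeroing the low mantissa bits of an IEEE 754 double so the resulting zero tail transmits or compresses cheaply. The function name `truncate_mantissa` is an illustrative assumption; in a real MPI program the truncated buffer would be packed on the sender and unpacked on the receiver.

```python
import struct

def truncate_mantissa(value, keep_bits):
    """Lossy-compress a float64 by zeroing its low mantissa bits.

    A float64 carries a 52-bit mantissa; keeping only the top `keep_bits`
    bits bounds the relative error by 2**-keep_bits while leaving a long
    run of zero bits that is cheap to encode or elide in transmission.
    """
    # Reinterpret the double as a 64-bit unsigned integer.
    bits = struct.unpack('<Q', struct.pack('<d', value))[0]
    # Mask off the low (52 - keep_bits) mantissa bits.
    mask = ~((1 << (52 - keep_bits)) - 1) & 0xFFFFFFFFFFFFFFFF
    return struct.unpack('<d', struct.pack('<Q', bits & mask))[0]

x = 3.141592653589793
approx = truncate_mantissa(x, 20)          # keep 20 of 52 mantissa bits
assert abs(approx - x) / abs(x) < 2 ** -20  # bounded relative error
```

Truncation toward zero keeps the sign and exponent intact, so the error bound holds uniformly across magnitudes; an application can pick `keep_bits` to match its quality-of-result error bound.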