Zhengming Yi scite author profile

Zhengming Yi

5 Publications

7 Citation Statements Received

142 Citation Statements Given

Affiliations

National University of Defense Technology

Publications

A barrier optimization framework for NUMA multi‐core system

Chen

Yao

2019

Concurrency and Computation

Parallel program performance often depends critically on barrier performance. On modern NUMA multi-core machines, barrier synchronization performance is significantly affected by cache-coherence communication between cores. In large NUMA systems especially, complex interconnection networks, memory hierarchies, and cache-coherence protocols make barrier algorithms hard to optimize. We propose a general barrier optimization framework for NUMA multi-core machines. The framework splits the barrier into three stages: barrier arrival within a NUMA node, barrier arrival across NUMA nodes, and wakeup, providing an opportunity to optimize the communication pattern and the cache-line placement in each stage. To reduce remote communication traffic, we introduce a coordinator per NUMA node. In addition, we implement two barrier algorithms based on the framework. Finally, we show the superiority of the barrier algorithms within our framework over other barrier algorithms and show how to translate a barrier algorithm into a performance model to help make an optimal design tradeoff. Experiments on three NUMA multi-core platforms show that barrier algorithms optimized within our framework deliver performance as good as or better than state-of-the-art approaches on NUMA multi-core machines.
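The three-stage split described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `HierarchicalBarrier`, the helper `run_barrier_demo`, and the choice of the last-arriving thread in each node as that node's coordinator are all assumptions made for illustration. Python threads also cannot be pinned to NUMA nodes here, so `node_id` is only a logical label, and the wakeup stage uses a single shared condition variable rather than any cache-line-aware layout.

```python
import threading


class HierarchicalBarrier:
    """Illustrative three-stage barrier (hypothetical, not the paper's code)."""

    def __init__(self, num_nodes, threads_per_node):
        self.num_nodes = num_nodes
        self.threads_per_node = threads_per_node
        # Stage 1 state: one arrival counter per logical NUMA node.
        self.node_locks = [threading.Lock() for _ in range(num_nodes)]
        self.node_counts = [0] * num_nodes
        # Stage 2 state: counts coordinators that have arrived.
        self.global_lock = threading.Lock()
        self.global_count = 0
        # Stage 3 state: generation number bumped once per barrier episode.
        self.cond = threading.Condition()
        self.generation = 0

    def wait(self, node_id):
        # Record the current episode before announcing arrival.
        with self.cond:
            my_generation = self.generation

        # Stage 1: arrival within the node. The last arriver acts as
        # the node's coordinator (an assumption of this sketch).
        with self.node_locks[node_id]:
            self.node_counts[node_id] += 1
            is_coordinator = self.node_counts[node_id] == self.threads_per_node
            if is_coordinator:
                self.node_counts[node_id] = 0

        # Stage 2: arrival across nodes. Only coordinators touch the
        # shared counter, so cross-node traffic is one update per node.
        release = False
        if is_coordinator:
            with self.global_lock:
                self.global_count += 1
                if self.global_count == self.num_nodes:
                    self.global_count = 0
                    release = True

        # Stage 3: wakeup. The last coordinator advances the generation;
        # everyone else waits until the generation changes.
        with self.cond:
            if release:
                self.generation += 1
                self.cond.notify_all()
            else:
                while self.generation == my_generation:
                    self.cond.wait()


def run_barrier_demo(num_nodes=2, threads_per_node=3, rounds=5):
    """Run several barrier rounds and record any thread that got through early."""
    barrier = HierarchicalBarrier(num_nodes, threads_per_node)
    total = num_nodes * threads_per_node
    arrived = [0] * rounds
    lock = threading.Lock()
    violations = []

    def worker(node_id):
        for r in range(rounds):
            with lock:
                arrived[r] += 1
            barrier.wait(node_id)
            with lock:
                # After the barrier, every thread must have arrived in round r.
                if arrived[r] != total:
                    violations.append((node_id, r))

    threads = [
        threading.Thread(target=worker, args=(node,))
        for node in range(num_nodes)
        for _ in range(threads_per_node)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return arrived, violations
```

Calling `run_barrier_demo()` returns the per-round arrival counts and a list of violations, which should be empty if no thread passed the barrier before all others arrived.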

A scalable lock on NUMA multicore

Yao

2020

Concurrency and Computation

Modern NUMA multicore architectures exhibit complicated memory behavior, such as cache-coherence invalidation and nonuniform memory access, where access from a core to its local memory is significantly faster than cross-node access to memory on a different NUMA node. This behavior has a large impact on the efficiency of lock synchronization, which in turn affects the performance of parallel applications. Prior work offers several efficient designs to improve locking performance, such as delegation schemes. However, existing delegation schemes occupy computing cores, scale poorly, or are less portable. In this work, we present a NUMA-aware delegation lock that occupies no cores while offering scalable performance under high contention on NUMA multicore machines. The new lock is a variant of the efficient FFWD lock and inherits its performance features, such as buffering responses within a NUMA node to minimize cache-coherence traffic. Unlike FFWD, the new lock employs hierarchical NUMA-aware memory allocation and a NUMA-aware dynamic server thread technique to reduce cross-node communication between client and server threads. Our evaluation shows that the new lock outperforms FFWD under high contention and achieves significant performance gains over other state-of-the-art locks.
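The general idea behind a delegation scheme, where clients hand their critical sections to a server thread instead of acquiring a lock themselves, can be sketched in Python as follows. This illustrates only the basic client/server pattern with a dedicated server thread. It is not the lock proposed in the paper, which according to the abstract avoids occupying a core and adds NUMA-aware allocation and dynamic server threads. The names `DelegationServer`, `delegate`, and `run_delegation_demo` are hypothetical, and a `queue.Queue` stands in for whatever request and response buffers a real implementation would use.

```python
import queue
import threading


class DelegationServer:
    """Illustrative delegation pattern (hypothetical, not the paper's lock)."""

    def __init__(self):
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._serve, daemon=True)
        self._thread.start()

    def _serve(self):
        # The server is the only thread that runs critical sections,
        # so the protected data never moves between client threads.
        while True:
            item = self._requests.get()
            if item is None:  # shutdown signal
                break
            func, args, reply = item
            try:
                reply.put((True, func(*args)))
            except Exception as exc:  # hand failures back to the client
                reply.put((False, exc))

    def delegate(self, func, *args):
        """Ask the server to run func(*args) and wait for its result."""
        reply = queue.Queue(maxsize=1)
        self._requests.put((func, args, reply))
        ok, value = reply.get()
        if not ok:
            raise value
        return value

    def shutdown(self):
        self._requests.put(None)
        self._thread.join()


def run_delegation_demo(num_clients=4, increments=1000):
    """Increment a shared counter from several clients via delegation."""
    server = DelegationServer()
    state = {"counter": 0}

    def increment():
        # Executed only on the server thread, so no lock is needed here.
        state["counter"] += 1
        return state["counter"]

    def client():
        for _ in range(increments):
            server.delegate(increment)

    clients = [threading.Thread(target=client) for _ in range(num_clients)]
    for t in clients:
        t.start()
    for t in clients:
        t.join()
    server.shutdown()
    return state["counter"]
```

With the default arguments, `run_delegation_demo()` performs 4000 delegated increments and returns the final counter value.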

A stealing mechanism for delegation methods

Yi

Yao

2021

J Supercomput

FTSD

Yao

Chen

2021

A Universal Construction to implement Concurrent Data Structure for NUMA-multicore

Yao

Chen

2021
