Feasibility study of MPI implementation on the heterogeneous multi-core cell BE™ architecture

Kumar, Arun; Jayam, Naresh; Srinivasan, Ashok; Senthilkumar, Ganapathy; Baruah, Pallav Kumar; Kapoor, Shakti; Krishna, Murali; Sarma, Raghunath

doi:10.1145/1248377.1248387

Cited by 11 publications

(4 citation statements)

References 4 publications


“…Therefore, it is vital to implement MPI efficiently on the Cell BE processor to leverage its tremendous computational power. For this, there has been a feasibility study of MPI implementation on the Cell BE [6,7]. In this study, a minimal set of a synchronous mode MPI on the Cell BE has been implemented, and the results show the potential of the Cell BE to run MPI applications efficiently.…”

Section: Motivation and Related Work (mentioning)

confidence: 99%

Extended characterization of DMA transfers on the Cell BE processor

Khunjush, Dimopoulos (2008)

2008 IEEE International Symposium on Parallel and Distributed Processing

The main contributors to message delivery latency in message passing environments are the copying operations needed to transfer and bind a received message to the consuming process/thread. A significant portion of the software communication overhead is attributed to message copying. Recently, a set of factors has been leading high-performance processor architectures toward designs that feature multiple processing cores on a single chip (a.k.a. CMP). The Cell Broadband Engine (BE) shows potential to provide high performance to parallel applications (e.g., MPI applications). The Cell's non-homogeneous architecture, along with the small local storage in its SPEs, imposes restrictions and challenges on parallel applications. In this work, we first characterize various data delivery mechanisms in the Cell BE processor; then, we propose techniques to facilitate the delivery of a message in MPI environments implemented in the Cell BE processor. We envision a cluster system comprising several Cell processors, each supporting several computation threads.

“…A recent publication by Bellens et al [4] describes CellSs, a new programming model for Cell-like architectures. Arun Kumar et al [5] provide a MPI implementation on Cell. Its memory access relies on software cache technology.…”

Section: Related Work (mentioning)

confidence: 99%

Study on Explicit Memory Management for CBEA Green Computing Architecture

Liu, Ming, Chen, et al. (2011), AMR

Heterogeneous multi-core processors are attractive for power-efficient green computing because of their ability to meet varied resource requirements. The multi-level memory hierarchy of the Cell Broadband Engine Architecture (CBEA), which requires explicit management by software, poses significant challenges to both performance and programming. In this paper, based on an analysis of the architecture's characteristics, we implemented four access methods and a corresponding access library with a uniform memory access interface. Besides delivering performance gains over current technology, the memory access library can collect profile information on memory management for further performance optimization. Experimental results show that the proposed method performs better than related work and that the profile information it provides helps programmers optimize application performance.

“…Eichenberger et al [13] describe a series of compiler optimizations, while the proposal of software cache deals with memory access and considers the efficiency of DMA transfers. Arun Kumar et al [14] provide a MPI implementation on Cell with memory access relying on software-managed cache. But performance degradation is a major issue of current software-managed cache technologies.…”

Section: Related Work (mentioning)

confidence: 99%

A Profile-based Memory Access Optimizing Technology on CBE Architecture

Feng, Dong, et al. (2008)

2008 10th IEEE International Conference on High Performance Computing and Communications

In the paper, we investigate the memory access technology on Cell Broadband Engine Architecture (CBEA), and develop a profiling infrastructure for memory management on the architecture. By registering the dynamic memory allocation and providing details of trace of memory access the infrastructure provides the data partition information automatically which alleviates the burdens of programmer and provides a safety guarantee for aggressive data prefetch for computing task. On the other hand, the profile information is useful for analyzing the patterns of memory access and helpful for further performance optimization. Experimental results show that applications implemented based on our SDK library not only support aggressive memory access method without the requirement of external data partition information, but also could be optimized aggressively under the guideline of the profile information provided by the proposed SDK library.
