System effects of interprocessor communication latency in multicomputers

Zhang, X.

doi:10.1109/40.76617

Cited by 24 publications

(2 citation statements)

References 6 publications

“…This example shows that bisection bandwidth is an important limitation in a multicomputer system [12]. The CM-5 data network is a modified 4D fat tree. A fat-tree network is a treelike structure in which bandwidth increases at each level nearer the root.…”

Section: Performance Implications Of Network Architectures (mentioning)

confidence: 99%

Distributed edge detection: issues and implementations

Zhang

Dykes

Deng

1997

IEEE Comput. Sci. Eng.

“…Table 1 gives the and of the GP1000 comparing with other five types of distributed memory multicomputers (see e.g. [29]).…”

Section: Interprocessor Communication Overhead (mentioning)

confidence: 99%

Performance prediction and evaluation of parallel processing on a NUMA multiprocessor

Zhang

Qin

1991

IEEE Trans. Software Eng.

Self Cite

Non-Uniform Memory Access (NUMA) architectures make it possible to build large-scale shared memory multiprocessor systems in comparison with non-scalable Uniform Memory Access (UMA) architectures. Most NUMA multiprocessor operations such as scheduling and synchronizing processes, accessing data from processors to memory models and allocating distributed memory space to different processors, are performed through interconnection networks such as a multistage switching network. The efficiency of these basic operations determines the parallel processing performance on a NUMA multiprocessor. This paper presents several analytical models to predict and evaluate the overhead of interprocessor communication, process scheduling, process synchronization and remote memory access where network contention and memory contention are considered. Performance measurements to support the models and analyses through several numerical examples have been done on the BBN GP1000, a NUMA shared memory multiprocessor. Both analytical and experimental results give a comprehensive and clear understanding of the various effects, which are important for the effective use of a NUMA shared memory multiprocessor. The results in this paper may be used to determine optimal strategies in developing an efficient programming environment for a NUMA system.

Support for multiple classes of traffic in multicomputer routers

Rexford

Shin

1994

Parallel Computer Routing and Communication
