2003
DOI: 10.1109/tc.2003.1214336

Latency, Occupancy, and Bandwidth in DSM Multiprocessors: A Performance Evaluation

Abstract: While the desire to use commodity parts in the communication architecture of a DSM multiprocessor offers advantages in cost and design time, the impact on application performance is unclear. We study this performance impact through detailed simulation, analytical modeling, and experiments on a flexible DSM prototype, using a range of parallel applications. We adapt the LogP model to characterize the communication architectures of DSM machines. The l (network latency) and o (controller occupancy) param…

Cited by 16 publications (11 citation statements)
References 34 publications
“…The immediate successor of shared memory from multiprocessor world would have a page as the unit for data transfer. Granularity describes the size of the minimum unit of shared memory [13,14,15]. In the said DSM framework it is the page-size.…”
Section: Granularity (mentioning)
confidence: 99%
“…If we continue to assume that network latency is the primary performance determinant, the time complexity of the release stage is O(1), because the N invalidation messages and subsequent N reload requests can be pipelined. However, researchers have reported that memory controller (MMC) occupancy has a greater impact on barrier performance than network latency for medium-sized DSM multiprocessors [6]. In other words, the assumption that coherence messages can be sent from or processed by a particular memory controller in negligible time does not hold.…”
Section: Time Complexity Analysis (mentioning)
confidence: 99%
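The argument in the statement above can be illustrated with a rough cost model. The sketch below is not taken from the cited work: the function name, the parameters, and the assumption of two network traversals on the critical path are all hypothetical. It only shows why the release stage looks O(1) when controller occupancy is ignored and grows with N when it is not.

```python
def release_time(n, latency, occupancy):
    """Rough estimate of barrier release time (hypothetical model).

    n         -- number of waiting processors
    latency   -- one-way network latency per message
    occupancy -- time the memory controller is busy per message
    """
    # Assumption: pipelined messages overlap in the network, so only two
    # traversals (invalidation out, reload request back) sit on the
    # critical path regardless of n.
    network = 2 * latency
    # Assumption: the controller handles the N invalidations and the N
    # reload requests one at a time, so occupancy is paid 2*N times.
    controller_busy = 2 * n * occupancy
    return network + controller_busy


# With zero occupancy the estimate does not depend on n.
print(release_time(8, 100, 0), release_time(64, 100, 0))    # 200 200
# With non-zero occupancy the estimate grows linearly in n.
print(release_time(8, 100, 10), release_time(64, 100, 10))  # 360 1480
```

Under these assumptions, the occupancy term overtakes the latency term once 2·N·occupancy exceeds 2·latency, which is the regime the statement describes for medium-sized machines.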
“…We accurately model the latency and cache effects of TLB misses. On two different occasions our processor model has been validated against real hardware [2], [8].…”
Section: Simulation Environment (mentioning)
confidence: 99%