1991
DOI: 10.1016/0743-7315(91)90014-z
Tolerating latency through software-controlled prefetching in shared-memory multiprocessors

Cited by 276 publications
(216 citation statements)
References 4 publications
“…Mowry proposed a compiler algorithm for selective prefetching in uniprocessor and multiprocessor systems [10]. Simulation analysis based on DASH-like CC-NUMA multiprocessors showed considerable improvements, between 6% and 53% reduction in execution time.…”
Section: Related Work
confidence: 99%
“…In software-controlled cache prefetching, a processor executes a special Pf instruction, which initiates a non-blocking fetch operation that brings a data block, expected to be used by that processor, into its cache [10]. Ideally, the data block arrives at the cache before it is needed by the processor, and its load instruction results in a cache hit (Figure 1b, processor P2).…”
Section: Introduction
confidence: 99%
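The mechanism this excerpt describes can be sketched in portable C. As an illustration only: `__builtin_prefetch` (a GCC/Clang builtin) stands in for the special non-blocking Pf instruction, and `PREFETCH_DIST` is an assumed prefetch distance that would be tuned per machine; neither name comes from the paper.

```c
#include <stddef.h>

/* Assumed prefetch distance: how many iterations ahead to issue the
   non-blocking fetch so the block arrives before the demand load. */
#define PREFETCH_DIST 16

long sum_with_prefetch(const long *a, size_t n) {
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        /* Non-blocking hint: fetch a future element into the cache.
           Second arg 0 = read, third arg 1 = low temporal locality. */
        if (i + PREFETCH_DIST < n)
            __builtin_prefetch(&a[i + PREFETCH_DIST], 0, 1);
        sum += a[i];  /* ideally now a cache hit, as in Figure 1b */
    }
    return sum;
}
```

Because the prefetch is only a hint, the result is identical with or without it; the hoped-for effect is on memory latency, not on correctness.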
“…In his comprehensive work on software data prefetching, Mowry [22] explores the effect on execution time of varying the number of outstanding prefetch requests that can be handled simultaneously by the hardware. The author also compares two versions of the prefetch issue buffer hardware: one in which the processor stalls when the buffer is full, and one where additional requests are simply dropped.…”
Section: Related Work
confidence: 99%
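The two issue-buffer policies the excerpt contrasts can be modeled with a small sketch. Everything here is illustrative, not from the paper: `BUF_SLOTS`, the `pf_buffer` type, and the function names are assumptions, and only the drop policy is shown in code (the stall policy would block the issuing processor until `pf_complete` frees a slot, which needs concurrency to model faithfully).

```c
#include <stdbool.h>
#include <stddef.h>

#define BUF_SLOTS 4  /* assumed number of simultaneously outstanding prefetches */

typedef struct {
    const void *pending[BUF_SLOTS];  /* addresses of in-flight prefetches */
    size_t count;                    /* outstanding requests */
    size_t dropped;                  /* requests discarded under the drop policy */
} pf_buffer;

/* Drop policy: a full buffer silently discards the new request; the later
   demand load then simply misses in the cache instead of hitting. */
bool pf_issue_drop(pf_buffer *b, const void *addr) {
    if (b->count == BUF_SLOTS) {
        b->dropped++;
        return false;
    }
    b->pending[b->count++] = addr;
    return true;
}

/* One outstanding prefetch finishes filling its cache block, freeing a slot. */
void pf_complete(pf_buffer *b) {
    if (b->count > 0)
        b->count--;
}
```

The design trade-off the excerpt points at: stalling preserves every prefetch but can erase the latency the prefetches were meant to hide, while dropping keeps the processor running at the cost of occasional extra cache misses.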
“…The concept of overlapping computation with I/O, network, and other long-latency operations is an old one. Prefetching techniques [15][16][17] and thread speculation [1,18,19] also exploit this kind of overlap. Most previous work on prefetching focused on moving data (mostly contiguous data) from main memory to local memory (either registers or cache) prior to execution.…”
Section: Related Work
confidence: 99%