IEEE/ACM International Conference on Computer Aided Design. ICCAD - 2000. IEEE/ACM Digest of Technical Papers (Cat. No.00CH3714
DOI: 10.1109/iccad.2000.896510

MIST: an algorithm for memory miss traffic management

Abstract: Cache misses represent a major bottleneck in embedded systems performance. Traditionally, compilers optimistically treated all memory accesses as cache hits, relying on the memory controller to account for longer miss delays. However, the memory controller has only a local view of the program, and is not able to efficiently hide the latency of these memory operations. Our compiler technique actively manages cache misses, and performs global miss traffic optimizations, to better hide the latency of the memory oper…
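The abstract's central point (the compiler, not the memory controller, should plan around predicted misses) can be illustrated with a toy scheduling model. The sketch below is not the paper's MIST algorithm; the latencies, instruction names, and single-issue stall model are assumptions chosen only to show how hoisting a predicted-miss load lets independent work overlap the miss latency.

# Hedged sketch: optimistic vs. miss-aware instruction ordering on a
# single-issue machine. One instruction issues per cycle, in order, but an
# instruction stalls until its operands are ready; a load's result becomes
# available 'latency' cycles after it issues. All numbers are illustrative.

def run(schedule, latency, deps):
    """Return the cycle at which the last instruction's result is ready."""
    ready = {}  # instruction name -> cycle its result is available
    cycle = 0
    for instr in schedule:
        # Stall until all operands are ready, then issue.
        start = max([cycle] + [ready[d] for d in deps.get(instr, [])])
        ready[instr] = start + latency[instr]
        cycle = start + 1
    return max(ready.values())

# 'load_x' is predicted to miss (20 cycles); the other ops take 1 cycle.
latency = {"load_x": 20, "mul1": 1, "mul2": 1, "add": 1, "use_x": 1}
deps = {"use_x": ["load_x"], "add": ["mul1", "mul2"]}

# Optimistic order: the compiler assumed a hit, so the load's consumer sits
# right behind it and absorbs the whole miss penalty.
optimistic = ["mul1", "mul2", "add", "load_x", "use_x"]

# Miss-aware order: the predicted-miss load is hoisted so the independent
# work (mul1, mul2, add) overlaps the memory latency.
miss_aware = ["load_x", "mul1", "mul2", "add", "use_x"]

print(run(optimistic, latency, deps))  # 24 cycles under this toy model
print(run(miss_aware, latency, deps))  # 21 cycles under this toy model

Under this toy model the optimistic order finishes in 24 cycles and the miss-aware order in 21; the gap widens as the miss penalty grows or as more independent work is available to schedule across the load.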

Cited by 14 publications (7 citation statements)
References 21 publications
“…As described earlier, we have already used this memory-aware ADL to generate a compiler [Grun et al 2000a] and manage the memory miss traffic [Grun et al 2000b], resulting in significantly improved performance. We performed comparative studies with the MULTI integrated development environment (IDE) version 3.5 from Green Hills Software Inc. [2003] for the MIPS R4000 processor.…”
Section: Methods
Mentioning, confidence: 99%
“…Section 4 shows an example of performance improvement due to this detailed memory subsystem timing information [Grun et al 2000a]. Such aggressive optimizations in the presence of efficient memory access modes (e.g., page/burst modes) and cache hierarchies [Grun et al 2000b] are only possible due to the explicit representation of the detailed memory architecture. We generate a memory simulator (shown shaded in Figure 1) that is integrated into the SIMPRESS simulator, allowing for detailed feedback on the memory subsystem architecture and its match to the target applications.…”
Section: Our Approach
Mentioning, confidence: 99%
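The page/burst-mode remark in the quotation above can be made concrete with a small timing sketch. The cycle counts, one-bank open-row policy, and 10-bit row split below are illustrative assumptions, not figures from the cited ADL or its memory models; the point is only that an access ordering which stays within one DRAM page is served faster, which is why exposing such modes to the compiler matters.

# Hedged sketch of DRAM page-mode timing (illustrative numbers, not from
# the cited work). A second access to the currently open row is served with
# a column access only; touching a new row costs precharge + activate too.

ROW_MISS_CYCLES = 12   # precharge + activate + column access (assumed)
ROW_HIT_CYCLES = 3     # column access only (assumed)

def dram_cycles(addresses, row_bits=10):
    """Total cycles for a word-address sequence under a one-bank,
    open-row policy."""
    open_row = None
    total = 0
    for addr in addresses:
        row = addr >> row_bits
        total += ROW_HIT_CYCLES if row == open_row else ROW_MISS_CYCLES
        open_row = row
    return total

# Same four accesses, two orderings: grouping accesses that fall within the
# same row (page) cuts the total latency.
scattered = [0x0000, 0x4000, 0x0008, 0x4008]
grouped   = [0x0000, 0x0008, 0x4000, 0x4008]
print(dram_cycles(scattered))  # 48 cycles: every access opens a new row
print(dram_cycles(grouped))    # 30 cycles: two accesses hit the open row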
“…The references r_X, r_Z, r_Y and r_W are not part of a reuse pair, since memory lines W, X, Y and Z are accessed only once in the stream. The reuse pair (r_A^1, r_A^2) has reuse distance 4, while the reuse pair (r_A^2, r_A^3) has reuse distance 0. The forward reuse distance of r_A^1 is 4, its backward reuse distance is ∞.…”
Section: Reuse Distance
Mentioning, confidence: 99%
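The quoted reuse distances can be reproduced with a short script. The access stream below is an assumption consistent with the quoted numbers (the cited paper's actual stream is not reproduced here); reuse distance is taken, as is standard, as the number of distinct memory lines touched between two consecutive accesses to the same line, and it is infinite when no earlier (backward) or later (forward) access to that line exists.

def reuse_distances(stream):
    """Return (forward, backward) reuse distances, one per access index."""
    n = len(stream)
    forward = [float("inf")] * n
    backward = [float("inf")] * n
    last_seen = {}  # memory line -> index of its most recent access
    for i, line in enumerate(stream):
        if line in last_seen:
            j = last_seen[line]
            d = len(set(stream[j + 1:i]))  # distinct lines accessed in between
            forward[j] = d
            backward[i] = d
        last_seen[line] = i
    return forward, backward

# Assumed access stream: memory line A is touched three times, W, X, Y and Z
# only once each, matching the distances quoted above.
stream = ["A", "W", "X", "Y", "Z", "A", "A"]
fwd, bwd = reuse_distances(stream)
print(fwd[0], bwd[0])   # 4 inf  -> r_A^1: forward 4, backward infinite
print(fwd[5], bwd[5])   # 0 4    -> r_A^2
print(fwd[6], bwd[6])   # inf 0  -> r_A^3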
“…Then, the technique was extended so that it can be well incorporated into loop tiling. More recently, Grun et al. presented a new compiler optimization which uses accurate memory access timing information for both cache hits and misses, and schedules instructions so that memory accesses are efficiently overlapped [23].…”
Section: B. Memory-aware Code Generation
Mentioning, confidence: 99%