Caches enhance the performance of multiprocessors by reducing network traffic and average memory access latency. However, cache-based systems must address the problem of cache coherence. We propose the LimitLESS directory protocol to solve this problem. The LimitLESS scheme uses a combination of hardware and software techniques to realize the performance of a full-map directory with the memory overhead of a limited directory. This protocol is supported by Alewife, a large-scale multiprocessor. We describe the architectural interfaces needed to implement the LimitLESS directory, and evaluate its performance through simulations of the Alewife machine.
Multiprocessor architects have begun to explore several mechanisms, such as prefetching, context-switching, and software-assisted dynamic cache-coherence, which transform single-phase memory transactions in conventional memory systems into multiphase operations. Multiphase operations introduce a window of vulnerability in which data can be invalidated before it is used. Losing data due to invalidations introduces damaging livelock situations. This thesis discusses the origins of the window of vulnerability and proposes an architectural framework that closes it. The framework employs fully-associative transaction buffers and an algorithm called thrashlock. It has been implemented as one facet of the Alewife machine, a large-scale cache-coherent multiprocessor.
Alewife is a multiprocessor architecture that supports up to 512 processing nodes connected over a scalable and cost-effective mesh network at a constant cost per node. The MIT Alewife machine, a prototype implementation of the architecture, demonstrates that a parallel system can be both scalable and programmable. Four mechanisms combine to achieve these goals: software-extended coherent shared memory provides a global, linear address space; integrated message passing allows compiler and operating system designers to provide efficient communication and synchronization; support for fine-grain computation allows many processors to cooperate on small problem sizes; and latency tolerance mechanisms, including block multithreading and prefetching, mask unavoidable delays due to communication.
Microbenchmarks, together with over a dozen complete applications running on the 32-node prototype, help to analyze the behavior of the system. Analysis shows that integrating message passing with shared memory enables a cost-efficient solution to the cache coherence problem and provides a rich set of programming primitives. Block multithreading and prefetching improve performance by up to 25% individually, and 35% together. Finally, language constructs that allow programmers to express fine-grain synchronization can improve performance by over a factor of two.
A variety of models for parallel architectures, such as shared memory, message passing, and data flow, have converged in the recent past to a hybrid architecture form called distributed shared memory (DSM). By using a combination of hardware and software mechanisms, DSM combines the nice features of all the above models and is able to achieve both the scalability of message-passing machines and the programmability of shared memory systems. Alewife, an early prototype of such DSM architectures, uses a hybrid of software and hardware mechanisms to support coherent shared memory, efficient user-level messaging, fine-grain synchronization, and latency tolerance. Alewife supports up to 512 processing nodes connected over a scalable and cost-effective mesh network at a constant cost per node. Four mechanisms combine to achieve Alewife's goals of scalability and programmability: software-extended coherent shared memory provides a global, linear address space; integrated message passing allows compiler and operating system designers to provide efficient communication and synchronization; support for fine-grain computation allows many processors to cooperate on small problem sizes; and latency-tolerance mechanisms, including block multithreading and prefetching, mask unavoidable delays due to communication.