The DIVA (Data IntensiVe Architecture) system incorporates a collection of Processing-In-Memory (PIM) chips as smart-memory co-processors to a conventional microprocessor. We have recently fabricated prototype DIVA PIMs. These chips represent the first smart-memory devices designed to support virtual addressing and capable of executing multiple threads of control. In this paper, we describe the prototype PIM architecture. We emphasize three unique features of DIVA PIMs, namely, the memory interface to the host processor, the 256-bit wide datapaths for exploiting on-chip bandwidth, and the address translation unit. We present detailed simulation results on eight benchmark applications. When just a single PIM chip is used, we achieve an average speedup of 3.3X over host-only execution, due to lower memory stall times and increased fine-grain parallelism. These 1-PIM results suggest that a PIM-based architecture with many such chips yields significantly higher performance than a multiprocessor of a similar scale and at a much reduced hardware cost.
This paper describes the implementation of theexception handling mechanism in the second prototype version of the Data-Intensive Architecture (DIVA) processing-inmemory (PIM) chip. This implementation features architectural simplicity, low area (54289 p 2 ) , delay (2.643 nanosecond) and power consumption (7.6 milliwatts), and effective hardware support for complex cases of exception handling. This work provides a description of handling memory-access, execution and communication-related exceptions in an area-and powerefficient manner, which are key design specifications for DIVA.The current implementation has been tested by verifying various exceptions on DIVA-I1 PIM chips running at 140MHz in the memory system of a HP Itanium2-based Long's Peak server. The generic nature of the DIVA exceptions and their classification makes the current implementation suitable and easy for use in diverse microarchitectures with little modification.
This paper describes the implementation of an areaefficient and protected user memory-mapped network interface, the pbuf (Parcel Buffer), for the Data IntensiVe Architecture (DIVA) processing-in-memory (PIM) system. This implementation of the pbuf in TSMC 0.18 µm CMOS technology displays an aggregate bi-directional throughput of 48.08Gbps, using low area (0.56 mm 2 ) and power consumption (32.30mW). These characteristics, especially the low area and power, have made the current implementation an ideal choice for assimilation in DIVA PIM systems, since low area and power are critical design requirements in the PIM philosophy. The pbuf implementation has been verified by the execution of a 2-PIM Transitive Closure benchmark at 140MHz on an HP Itanium2-based Long's Peak server containing DIMMs populated with DIVA-II PIM chips.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.