The IBM POWER processor is the dominant reduced instruction set computing microprocessor in the world today, with a rich history of implementation and innovation over the last 20 years. In this paper, we describe the key features of the POWER7 processor chip. The chip carries an eight-core processor, with each core capable of four-way simultaneous multithreaded operation. Fabricated in IBM's 45-nm silicon-on-insulator (SOI) technology with 11 levels of metal, the chip contains more than one billion transistors. The processor core and caches are significantly enhanced to boost the performance of both single-threaded, response-time-oriented applications and multithreaded, throughput-oriented applications. The memory subsystem contains three levels of on-chip cache, with SOI embedded dynamic random access memory (DRAM) devices used as the last level of cache. A new memory interface using buffered double-data-rate-three DRAM and improvements in reliability, availability, and serviceability are discussed.
POWER5 offers significantly increased performance over previous POWER designs by incorporating simultaneous multithreading, an enhanced memory subsystem, and extensive RAS and power management support. The 276M-transistor processor is implemented in 130nm silicon-on-insulator technology with 8 levels of Cu metallization and operates at >1.5 GHz.
General Terms: Design
Keywords: POWER5, Microprocessor Design, Simultaneous Multi-threading (SMT), Temperature Sensor, Power Reduction, Clock Gating
POWER5™ is the next generation of IBM's POWER microprocessors. This design, shown below in Figure 1, sets a new standard of industry-leading server performance by incorporating simultaneous multithreading (SMT), an enhanced distributed switch and memory subsystem supporting 1- to 64-way SMP, and extensive RAS support. First-pass hardware using IBM's 130nm silicon-on-insulator technology operates above 1.5GHz at 1.3V.
POWER5's dual-threaded SMT [1] creates up to two virtual processors per core, improving execution unit utilization and masking memory latency. Although a simplistic SMT implementation promised ~20% performance improvement, resizing critical microarchitectural resources almost doubles the SMT performance benefit in many cases, at a 24% area cost per core.
The two SMT cores interface with an enhanced memory subsystem. The cache hierarchy includes a larger (1.9MB) L2 cache, reduced L3 latency, and a larger (36MB) L3 cache located on a custom DRAM companion chip. The new on-chip main memory controller improves latency, and the enhanced interconnect fabric extends SMP scalability. Figure 2 depicts the microarchitectural changes introduced with the POWER5 chip.
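The SMT trade-off quoted in this abstract (a ~20% gain that "almost doubles" at a 24% area cost per core) can be turned into a rough throughput-per-area figure. The sketch below is illustrative arithmetic only. It assumes the gain doubles fully to 40%, which is an upper bound and not a number given in the abstract, and it assumes the 24% figure is the total per-core area cost of SMT. The function name `throughput_per_area` is hypothetical.

```python
# Illustrative arithmetic only, not a result from the paper.
baseline_smt_gain = 0.20  # ~20% gain from a simplistic SMT implementation (from the text)
tuned_smt_gain = 0.40     # ASSUMPTION: upper bound if the benefit fully doubled
area_cost = 0.24          # 24% area cost per core (from the text)


def throughput_per_area(gain: float, area_increase: float) -> float:
    """Relative throughput divided by relative core area.

    A value of 1.0 corresponds to a single-threaded core of the original size.
    """
    return (1.0 + gain) / (1.0 + area_increase)


ratio = throughput_per_area(tuned_smt_gain, area_cost)
print(f"throughput per unit core area, relative to baseline: {ratio:.3f}")
```

Under these assumptions the ratio stays above 1.0, so the added area would pay for itself in throughput. With only the simplistic 20% gain and the same area cost, it would fall below 1.0.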