R. Sautter scite author profile

Increasing demand for parallelism due to out-of-order and multi-threaded computation requires fast and dense arrays with multi-port capabilities. The load-store unit (LSU) of the POWER7™ microprocessor core has a 32kB L1 data cache composed of four 8kB blocks. In a two-cycle back-to-back operation it concurrently supports two independent read operations and one write operation. The array is organized in banks of 16 cells each, and the two reads operate independently in any of these banks, including two reads within the same bank or even the same cell. A bank selected for write is blocked for any read operation. If a read and a write collide within the same bank, collision-control circuitry gives the write priority over the read. Each read port provides 4B from 1 of 256 locations, whereas the double-bandwidth write operation provides individual control of 8B to 128 locations.

Figure 19.2.1 shows the back-to-back data cache loop. The two operand muxes select between the general-purpose registers (GPR), the feedback loop, and other read-port bypass operands; the result goes into an adder stage that generates the read addresses (AGEN). The array output data passes through a formatter stage, and the result is then driven back to the operand mux inputs. The cycle boundary at the array macro input is balanced between the two cycles to optimize the operating frequency, which is effectively determined by the whole back-to-back loop rather than by the actual data cache access.

Figure 19.2.2 shows the read/write-decoding scheme using a standard 6T-SRAM cell in an effectively triple-port array. In an 8kB instance, the 256 entries are grouped into 16 banks of 16 6T-SRAM cells each. The 0.462µm² SRAM cell drives a low bitline (BL) load and is optimized for performance; the two pass-gate devices are connected to separate wordlines (WLs), wl_t and wl_c, and local BLs, blt and blc. Single-ended reads are initiated by activating a WL connected to one of the pass-gates. For a write operation, both WLs of a given cell are active for a differential write.

The two-stage decoder is organized in a bank select (msb) and a row select within a bank (lsb). The collision case is handled in a read/write bank-control stage. If a bank m is not selected for a write (wr_msb = 0), the read address takes control (rd_msb = rd_lsb = 1), with each of the two read ports controlling a different WL. If both read addresses are the same, the two WLs wl_t and wl_c of the same cell are selected for read, which is supported. If a bank m is selected for a write (wr_msb = 1), the write takes control independent of the read address by forcing rd[0,1]_msb of both read ports to zero, blocking the read addresses for this bank. Since the write msb and lsb signals are connected to the read/write bank-control circuits of both read ports, it is guaranteed that both WLs of the same cell entry (wl_t and wl_c) are now selected, which is a prerequisite for a differential SRAM cell write scheme. Whether the cell is finally read or written also depends on the local read/write control circuitry, whi...
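The bank-level collision rule in this abstract can be illustrated with a small behavioral model. The sketch below is not the circuit from the paper; it only mirrors the stated rules (16 banks of 16 rows, one wordline per read port, both wordlines for a write, write-over-read priority within a bank). The function names, the 4-bit/4-bit address split, the use of the same 256-entry address space for the write, and the mapping of read port 0 to wl_t and read port 1 to wl_c are assumptions made for illustration.

```python
BANKS = 16          # banks per 8kB instance (from the abstract)
ROWS_PER_BANK = 16  # cells per bank (from the abstract)


def split_address(addr):
    """Split an entry address into (bank, row).

    Assumption: the upper 4 bits select the bank (msb) and the
    lower 4 bits select the row within the bank (lsb).
    """
    if not 0 <= addr < BANKS * ROWS_PER_BANK:
        raise ValueError(f"address {addr} out of range")
    return divmod(addr, ROWS_PER_BANK)


def select_wordlines(rd0=None, rd1=None, wr=None):
    """Return (wl_t, wl_c, blocked) for one access cycle.

    wl_t / wl_c are the sets of entries whose respective wordline is
    active; blocked lists the read ports suppressed by a write to the
    same bank. Assumption: read port 0 drives wl_t, read port 1 wl_c.
    """
    wl_t, wl_c, blocked = set(), set(), []
    wr_bank = split_address(wr)[0] if wr is not None else None

    for port, addr, lines in ((0, rd0, wl_t), (1, rd1, wl_c)):
        if addr is None:
            continue
        bank, _row = split_address(addr)
        if bank == wr_bank:
            # Write-over-read priority: the read is blocked in this bank.
            blocked.append(port)
        else:
            lines.add(addr)

    if wr is not None:
        # A differential write needs both wordlines of the same cell.
        wl_t.add(wr)
        wl_c.add(wr)

    return wl_t, wl_c, blocked


if __name__ == "__main__":
    # Two reads of the same cell, no write: both wordlines of that cell fire.
    print(select_wordlines(rd0=0x25, rd1=0x25))
    # Read port 0 collides with a write in bank 2 and is blocked.
    print(select_wordlines(rd0=0x25, rd1=0x31, wr=0x2A))
```

In this model a blocked read simply produces no wordline activation; how the real design reports or retries a blocked read is not covered by the abstract and is left out here.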


A 1.8 GHz Instruction Window Buffer

Leenstra, Pille, Mueler, et al.


An Instruction Window Buffer (IWB) addresses the challenges in microprocessor designs beyond a GHz. The IWB implements the processor parts for renaming, reservation station and reorder buffer as a unified buffer. Measured results on an experimental chip demonstrate operation of the IWB macros at 1.8GHz, with the chip at the fast end of the process distribution. The technology is 0.18µm CMOS8S bulk technology, with 7 levels of copper interconnect and a 1.5V supply. The IWB is implemented using static and delayed reset dynamic circuit macros [1].



R. Sautter

A 1.8-GHz instruction window buffer for an out-of-order microprocessor core

A Low Power and High Performance SOI SRAM Circuit Design with Improved Cell Stability

The Vector Fixed Point Unit of the Synergistic Processor Element of the Cell Architecture Processor

A 32kB 2R/1W L1 data cache in 45nm SOI technology for the POWER7™ processor

A 1.8 GHz Instruction Window Buffer
