Eric Busta scite author profile

Eric Busta

3 Publications

40 Citation Statements Received

11 Citation Statements Given

Affiliations

Advanced Micro Devices (United States)

Publications

Design solutions for the Bulldozer 32nm SOI 2-core processor module in an 8-core CPU

Fischer, Arekapudi, Busta, et al. 2011

AMD's 2-core "Bulldozer" module contains 213 million transistors in an 11-metal-layer 32nm HKMG SOI CMOS process and is designed to operate from 0.8 to 1.3V. This new micro-architecture [1] improves performance and frequency while reducing area and power compared to a previous AMD x86-64 CPU in the same process [2]. To achieve these goals, the design reduced the number of FO4 inverter delays/cycle by more than 20%, achieving higher frequencies in the same power envelope even with increased core counts. The 2-core CPU module area (including 2MB L2 cache) is 30.9mm² (Fig. 4.5.7).

The module design contains 84 unique custom macros and 317,000 scannable flops. Module-level VSS power gating (CC6) reduces leakage power by 95% when both cores are idle [2]. Transistor Vts across the design are mostly regular (47%) and long-channel regular (46%).

The Bulldozer micro-architecture is cycle-based, using soft-edge flip-flops (SEF) to provide high-frequency performance, process variation tolerance, and low power consumption (Fig. 4.5.1). Performance and process tolerance are provided by a 2-clock design: early and late clocks (ECLK, LCLK) create a soft timing edge, allowing limited cycle stealing. Power is reduced in low-power SEFs by internally gated slave latch clocks. The majority of flops (78%) are low-power, using high-performance flops only on timing-critical paths.

In contrast to leveraged power-optimized CPU designs [2,4], Bulldozer's ground-up design requires co-development of power efficiency, timing, and functionality. Initially, micro-architectural power is optimized using a power-aware high-level performance model. Next, before schematic completion, the team tracks and analyzes RTL-based clock and flip-flop activity (a proxy for switching power) to meet clock gating goals. Finally, a new power model enables early mixed schematic/layout analysis of transistor-level power. This enables aggressive power optimizations while the implementation is still malleable. The result is a design with low power consumption for typical applications, making it well-suited to active power management and boost (Fig. 4.5.2).

The L1 caches are split, with I-cache residing in the instruction unit and a D-cache located in each load/store unit of the 2 cores. The 2-way, 64KB I-cache consists of an 8×2 array of 4KB bank macros, with 2 more arrays for pre-decode bits. Load/store area in the 2 cores is at a premium, so the D-cache uses a 4-way 16KB array with performance features described later in the paper. Both L1 caches use an 8T storage cell. The change from a 6T cell in 45nm to 8T in 32nm was required to improve low-voltage margin and read timing and to reduce power. Use of the 8T cell also eliminated a difficult D-cache read-modify-write timing path. Reads use a 2-level pre-charged local/super bitline structure with delayed-onset keeper, single-rail, full-swing signals, and glitch latches.

Several D-cache performance features reduce conflicts and power. First, microbanking reduces read conflicts to the same rate as a previous 16-bank 64KB desi...

Design of the Two-Core x86-64 AMD “Bulldozer” Module in 32 nm SOI CMOS

McIntyre, Arekapudi, Busta, et al. 2012

IEEE J. Solid-State Circuits

Zen3: The AMD 2nd-Generation 7nm x86-64 Microprocessor Core

Burd, Li, Pistole, et al. 2022

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Part of the Research Solutions Family.