Joseph I. Chamdani scite author profile

Joseph I. Chamdani

3 Publications

6 Citation Statements Received

4 Citation Statements Given

Affiliations

Oracle (United States)

Publications

Low load latency through sum-addressed memory (SAM)

Lynch

Lauterbach

Chamdani

1998

SIGARCH Comput. Archit. News

Load latency contributes significantly to execution time. Because most cache accesses hit, cache-hit latency becomes an important component of expected load latency. Most modern microprocessors have base+offset addressing loads; thus effective cache-hit latency includes an addition as well as the RAM access. This paper introduces a new technique used in the UltraSPARC III microprocessor, Sum-Addressed Memory (SAM), which performs true addition using the decoder of the RAM array, with very low latency. We compare SAM with other methods for reducing the add part of load latency. These methods include sum-prediction with recovery, and bitwise indexing with duplicate-tolerance. The results demonstrate the superior performance of SAM.
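The abstract does not describe how the decoder performs the addition. As an illustration only, the sketch below uses a widely known carry-free equality test: a word line k can check whether base + offset equals k without propagating a carry, because each bit position only needs to compare the carry-in it requires against the carry-out its lower neighbor would produce. The function names `sum_matches_index` and `sam_decode` are hypothetical, and this is an assumption about the general idea, not the circuit from the paper.

```python
def sum_matches_index(a: int, b: int, k: int, bits: int) -> bool:
    """Return True if (a + b) mod 2**bits == k, without doing a carry-propagate add.

    Illustrative sketch only; not taken from the paper.
    """
    mask = (1 << bits) - 1
    a, b, k = a & mask, b & mask, k & mask
    # Carry-in each bit position would need for the sum bit to equal k's bit.
    required_carry_in = a ^ b ^ k
    # Carry-out each bit position would produce under that assumption.
    carry_out = (a & b) | ((a | b) & ~k)
    # Carry-out of bit i feeds bit i+1; the carry into bit 0 is zero.
    implied_carry_in = (carry_out << 1) & mask
    return required_carry_in == implied_carry_in


def sam_decode(a: int, b: int, bits: int) -> list[int]:
    """Model a decoder: list every word line whose equality test fires."""
    return [k for k in range(1 << bits) if sum_matches_index(a, b, k, bits)]


if __name__ == "__main__":
    # Exactly one word line fires, and it is the true sum modulo 2**bits.
    print(sam_decode(3, 5, 4))
    print(sam_decode(12, 9, 4))
```

In this model every per-bit check depends only on two adjacent bit positions, which is why the comparison can be evaluated in parallel for all word lines instead of waiting for a full adder.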

UltraSPARC-III: a 3rd generation 64 b SPARC microprocessor

Lauterbach

Greenley

Ahmed

et al.

UltraSPARC-III (US-III) is a 64b 800MHz 4-instruction-issue superscalar microprocessor for high-performance desktop workstation, work group server, and enterprise server platforms. On-chip caches include a 64kB 4-way associative for data (D$), 32kB 4-way associative for instructions (I$), a 2kB 4-way associative data prefetch cache (P$), and a 2kB 4-way associative write (W$). A 90kB on-chip tag array supports the off-chip 8MB unified second-level cache (E$) [1]. The 23M-transistor chip in a 0.15µm, 7-layer metal process consumes 60W from a 1.5V supply [2]. The architecture is driven by performance, scalability and compatibility. The design is SPARC V9-compliant, maintaining binary compatibility with all 10,000+ existing SPARC applications [3]. Scalability in two directions is required: 1) taking full entitlement of future process improvements to scale clock rate and 2) off-chip interfaces that enable scaling multi-processor (MP) systems to 1000+ processors. Performance can be achieved in multiple ways. Clock rate is prioritized over IPC improvements, setting a goal of 1.5x the clock rate compared to the previous designs in the same process technology, as well as IPC and compiler improvement goals of 1.15x each, for a doubling of overall performance [4]. This requires different approaches to the micro-architecture, as well as more aggressive circuit and physical design, compared to previous UltraSPARC processors [5,6]. 8 static gates are budgeted for each of the 14 pipeline stages vs. 9 stages and 20 static gates/stage on US-I/II. Timing is more critical in the instruction fetch, integer execution, and floating-point (FP) areas, where dynamic logic is used liberally, than in the memory system.
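The "doubling of overall performance" follows from the three stated goals if one assumes they compound multiplicatively (an assumption; the abstract does not spell out the calculation):

```latex
\[
\underbrace{1.5}_{\text{clock rate}} \times \underbrace{1.15}_{\text{IPC}} \times \underbrace{1.15}_{\text{compiler}} = 1.98375 \approx 2
\]
```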

Guidance, Navigation and Control Digital Emulation Technology Laboratory

Alford

Chamdani

Huang

et al. 1994
