iCFP: Tolerating all-level cache misses in in-order processors
Andrew Hilton, Santosh Nagarakatte, and Amir Roth
Department of Computer and Information Science, University of Pennsylvania
{adhilton, santoshn, amir}@cis.upenn.edu

This conference paper is available at ScholarlyCommons: http://repository.upenn.edu/cis_papers/410

Abstract

Growing concerns about power have revived interest in in-order pipelines. In-order pipelines sacrifice single-thread performance. Specifically, they do not allow execution to flow freely around data cache misses. As a result, they have difficulty overlapping independent misses with one another. Previously proposed techniques like Runahead execution and Multipass pipelining have attacked this problem. In this paper, we go a step further and introduce iCFP (in-order Continual Flow Pipeline), an adaptation of the CFP concept to an in-order processor. When iCFP encounters a primary data cache or L2 miss, it checkpoints the register file and transitions into an "advance" execution mode. Miss-independent instructions execute as usual and even update register state. Miss-dependent instructions are diverted into a slice buffer, unblocking the pipeline latches. When the miss returns, iCFP "rallies" and executes the contents of the slice buffer, merging miss-dependent state with miss-independent state along the way. An enhanced register dependence tracking scheme and a novel store buffer design facilitate the merging process. Cycle-level simulations show that iCFP outperforms Runahead, Multipass, and SLTP, another non-blocking in-order pipeline design.

Keywords: multiprocessing systems, pipeline processing, Runahead execution, all-level cache, in-order continual flow pipeline, in-order pipelines, in-order processors, miss-independent instructions, multipass pipelining, register dependence tracking scheme, register file
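The advance and rally modes described in the abstract can be illustrated with a toy functional model. This is only a sketch under strong assumptions, not the authors' hardware design: the `Instr` class, the `misses` flag, and both function names are hypothetical; the miss is assumed to return only after the whole program has advanced; and checkpointing, timing, and the store buffer are left out. The abstract does not spell out the register dependence tracking scheme, so the sketch substitutes per-register poison marks plus a last-writer sequence number, which is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Instr:
    dst: str                  # destination register name
    srcs: Tuple[str, ...]     # source register names
    op: Callable[..., int]    # computes the result from the source values
    misses: bool = False      # hypothetical flag: a load that misses the cache


def run_in_order(program: List[Instr], regs: Dict[str, int]) -> Dict[str, int]:
    """Reference semantics: execute every instruction in program order."""
    regs = dict(regs)
    for ins in program:
        regs[ins.dst] = ins.op(*(regs[s] for s in ins.srcs))
    return regs


def run_advance_rally(program: List[Instr], regs: Dict[str, int]) -> Dict[str, int]:
    """Toy advance/rally execution (assumed tracking scheme, see lead-in)."""
    regs = dict(regs)
    poisoned = set()      # registers whose latest value depends on a pending miss
    last_writer = {}      # register -> sequence number of its latest writer
    slice_buffer = []     # deferred (seq, instruction, captured ready operands)

    # Advance mode: execute miss-independent work, defer miss-dependent work.
    for seq, ins in enumerate(program):
        last_writer[ins.dst] = seq
        if ins.misses or any(s in poisoned for s in ins.srcs):
            # Capture operands that are ready now; later writes may overwrite them.
            ready = {s: regs[s] for s in ins.srcs if s not in poisoned}
            slice_buffer.append((seq, ins, ready))
            poisoned.add(ins.dst)
        else:
            regs[ins.dst] = ins.op(*(regs[s] for s in ins.srcs))
            poisoned.discard(ins.dst)

    # Rally mode: the miss has returned, so replay the slice in program order.
    slice_vals = {}       # values produced by slice instructions during rally
    for seq, ins, ready in slice_buffer:
        args = [ready[s] if s in ready else slice_vals[s] for s in ins.srcs]
        result = ins.op(*args)
        slice_vals[ins.dst] = result
        # Merge: update the register only if no younger instruction wrote it.
        if last_writer[ins.dst] == seq:
            regs[ins.dst] = result
    return regs


program = [
    Instr("r1", (), lambda: 40, misses=True),       # load that misses
    Instr("r2", ("r1",), lambda a: a + 2),          # miss-dependent
    Instr("r1", (), lambda: 5),                     # independent overwrite of r1
    Instr("r3", ("r1", "r0"), lambda a, b: a + b),  # miss-independent
    Instr("r4", ("r2", "r3"), lambda a, b: a * b),  # miss-dependent through r2
]
init = {"r0": 1}
```

In this example the first, second, and fifth instructions go to the slice buffer, while the third and fourth execute during advance mode. During rally, the deferred load's result is forwarded to the second instruction but is not written to `r1`, because a younger miss-independent instruction already overwrote that register. Both `run_in_order(program, init)` and `run_advance_rally(program, init)` end with `r1 = 5`, `r2 = 42`, `r3 = 6`, and `r4 = 252`.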