Vladimir Uzelac scite author profile

Vladimir Uzelac

5 Publications

61 Citation Statements Received

63 Citation Statements Given

Affiliations

University of Alabama in Huntsville

Publications

Experiment flows and microbenchmarks for reverse engineering of branch predictor structures

Uzelac

Milenkovic

2009

1. INTRODUCTION

Branch predictors are one of the key units in the front-end of modern high-performance microprocessors. They detect branches and predict the branch target address and the branch outcome in the early pipeline stages, thus reducing the number of clock cycles wasted on control hazards. The target of a direct branch is predicted using a branch target buffer (BTB) [1], a cache structure indexed by a portion of the branch address. Each BTB entry typically includes the tag field, the offset field, the branch type field (e.g., direct/indirect, unconditional/conditional), the valid bit, the replacement bits for multi-way BTBs, and the target address. A separate hardware structure named an indirect branch target buffer (iBTB) can be employed to handle indirect branches with multiple target addresses [2][3][4]. The branch outcome predictors have evolved from a simple linear branch history table (BHT) with 2-bit saturating counters (2bc) [5] to very sophisticated branch predictor structures found in recent commercial microprocessors [6][7][8][9]. A number of advanced predictor structures have been proposed, including (i) two-level adaptive predictors that exploit global or local histories of branch outcomes to achieve a better mapping into the BHT [10,11], (ii) de-interference predictors that reduce the negative effects of branch interference [12][13][14][15], (iii) hybrid predictors that include multiple specialized structures [16][17][18], and (iv) perceptron predictors [19,20].

Code optimizations based on information about branch predictor structures can greatly increase overall program performance [21,22]. For example, if the compiler is aware of the BTB size and organization, it can prevent branch interference in critical portions of the code by re-aligning the branch instructions. Similarly, if the compiler is aware of local and global branch history lengths, it can employ code duplication or loop unrolling transformations to alleviate mispredictions [21]. Jimenez introduced the Camino C compiler [22], which exploits knowledge about branch predictor internal structures. It performs feedback-directed code placement to reduce the number of branch mispredictions in the NetBurst architecture. This optimization reduces the number of branch mispredictions in the SPEC CPU2K benchmarks by between 3.5% and 22%.

Unfortunately, microprocessor manufacturers rarely disclose full information about the branch predictor organization, which hinders efforts aimed at better code optimization. This problem can be addressed by applying reverse engineering techniques to branch predictor units. A prior reverse engineering flow focusing on the P6 and NetBurst architectures [21] has been successful in determining the size and organization of the BTB and the presence and lengths of global and local histories. However, this flow does not include any experiments for determining the organization of predictor structures indexed by program path information, nor their internal operation. In addition, it does not incl...
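The simplest outcome predictor mentioned above, a linear BHT of 2-bit saturating counters, can be sketched in a few lines. This is an illustrative model only: the table size, the indexing by low-order address bits, the initial counter value, and the names `TwoBitBHT` and `mispredictions` are assumptions, not details taken from the paper or from any real processor.

```python
class TwoBitBHT:
    """Linear branch history table of 2-bit saturating counters (sketch)."""

    def __init__(self, entries=1024):
        # Assumption: power-of-two table indexed by low-order address bits.
        self.mask = entries - 1
        # Counter values 0..3; 0-1 predict not-taken, 2-3 predict taken.
        # Assumption: every counter starts at 1 (weakly not-taken).
        self.counters = [1] * entries

    def predict(self, pc):
        return self.counters[pc & self.mask] >= 2

    def update(self, pc, taken):
        i = pc & self.mask
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)  # saturate at 3
        else:
            self.counters[i] = max(0, self.counters[i] - 1)  # saturate at 0


def mispredictions(bht, trace):
    """Count mispredictions over a trace of (branch_address, taken) pairs."""
    misses = 0
    for pc, taken in trace:
        if bht.predict(pc) != taken:
            misses += 1
        bht.update(pc, taken)
    return misses
```

Feeding such a model a loop branch that is taken nine times and then falls through yields two mispredictions under these assumptions: one while the counter warms up and one at loop exit. Reverse engineering microbenchmarks work in the opposite direction, inferring the hidden structure from observed misprediction counts.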

Real-time unobtrusive program execution trace compression using branch predictor events

Uzelac

Milenković

Burtscher

et al. 2010

A real-time program trace compressor utilizing double move-to-front method

Uzelac

Milenkovic

2009

This paper introduces a new unobtrusive and cost-effective method for the capture and compression of program execution traces in real time, based on a double move-to-front transformation. We explore its effectiveness and describe a cost-effective hardware implementation. The proposed trace compressor requires only 0.12 bits per instruction of trace port bandwidth, at the cost of 25K gates.
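For readers unfamiliar with the underlying idea, the following is a minimal sketch of a plain, single move-to-front (MTF) transform. It is not the paper's double move-to-front scheme, whose details are not given in this abstract; the function names and the explicit `alphabet` parameter are assumptions made for illustration.

```python
def mtf_encode(symbols, alphabet):
    """Replace each symbol by its current position in a recency list."""
    table = list(alphabet)
    out = []
    for s in symbols:
        i = table.index(s)
        out.append(i)
        table.insert(0, table.pop(i))  # move the symbol to the front
    return out


def mtf_decode(indices, alphabet):
    """Invert mtf_encode by replaying the same list updates."""
    table = list(alphabet)
    out = []
    for i in indices:
        s = table.pop(i)
        out.append(s)
        table.insert(0, s)
    return out
```

Recently seen symbols map to small indices, so a stream with strong temporal locality turns into a stream dominated by small numbers, which a later encoding stage can represent in few bits.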

Caches and Predictors for Real-Time, Unobtrusive, and Cost-Effective Program Tracing in Embedded Systems

Milenković

Uzelac

Milenković

et al. 2011

IEEE Trans. Comput.

Using Branch Predictors and Variable Encoding for On-the-Fly Program Tracing

Uzelac

Milenković

Milenković

et al. 2014

IEEE Trans. Comput.

Abstract: Unobtrusive capturing of program execution traces in real time is crucial for debugging many embedded systems. However, tracing even limited program segments is often cost-prohibitive, requiring wide trace ports and large on-chip trace buffers. This paper introduces a new cost-effective technique for capturing and compressing program execution traces on-the-fly. It relies on branch predictor-like structures in the trace module and corresponding software modules in the debugger to significantly reduce the number of events that need to be streamed out of the target system. Coupled with an effective variable encoding scheme that adapts to changing program patterns, our technique requires merely 0.029 bits per instruction of trace port bandwidth, providing a 34-fold improvement over the commercial state-of-the-art and a five-fold improvement over academic proposals, at the low cost of under 5,000 logic gates.
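The general idea of pairing a predictor in the trace module with an identical software predictor in the debugger can be sketched as follows. This is a simplified illustration, not the paper's design: the 2-bit counter predictor, the event format (number of correct predictions before each misprediction), and the names `TwoBitPredictor`, `compress`, and `decompress` are all assumptions. The sketch also assumes the debugger already knows the sequence of branch addresses, which a real debugger would have to derive by replaying the program binary.

```python
class TwoBitPredictor:
    """Per-address 2-bit saturating counters (assumed predictor model)."""

    def __init__(self):
        self.counters = {}

    def predict(self, pc):
        return self.counters.get(pc, 1) >= 2

    def update(self, pc, taken):
        c = self.counters.get(pc, 1)
        self.counters[pc] = min(3, c + 1) if taken else max(0, c - 1)


def compress(trace):
    """Target side: emit one event per misprediction.

    Each event is the number of correct predictions since the last miss.
    Returns (events, trailing_hits).
    """
    pred = TwoBitPredictor()
    events, hits = [], 0
    for pc, taken in trace:
        if pred.predict(pc) == taken:
            hits += 1
        else:
            events.append(hits)
            hits = 0
        pred.update(pc, taken)
    return events, hits


def decompress(pcs, events, tail_hits):
    """Debugger side: replay an identical predictor to rebuild outcomes."""
    pred = TwoBitPredictor()
    flags = []
    for h in events:
        flags.extend([True] * h + [False])  # h correct guesses, then a miss
    flags.extend([True] * tail_hits)
    outcomes = []
    for pc, correct in zip(pcs, flags):
        guess = pred.predict(pc)
        taken = guess if correct else not guess
        outcomes.append(taken)
        pred.update(pc, taken)
    return outcomes
```

Because both sides update their predictors with the same outcomes, they stay in lockstep, and only the rare mispredictions have to cross the trace port. The variable encoding of the event values described in the abstract is not modeled here.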
