Ivan D. Baev scite author profile

Eichenberger

2002

Algorithmica

Scheduling superblocks with bound-based branch trade-offs

2001

IEEE Trans. Comput.

Since instruction-level parallelism in basic blocks is often limited, compilers increase performance by creating superblocks that allow operations to be issued speculatively. This is difficult in general because each branch competes for the processor's limited resources. Previous work manages the performance trade-offs that exist between branches only indirectly. We show here that dependence and resource constraints can be used to gather explicit knowledge about scheduling trade-offs between branches. This paper's first contribution is a set of new, tighter lower bounds on the execution times of superblocks that specifically account for the dependence and resource conflicts between pairs of branches. This paper's second contribution is a novel superblock scheduling heuristic that finds high-performance schedules by determining the operations that each branch needs to be scheduled early and selecting branches with compatible needs that favor beneficial branch trade-offs. Performance evaluations for superblocks from SPECint95 indicate that our bounds are very tight and that our scheduling heuristic outperforms well-known superblock scheduling algorithms. Index Terms: Superblock, scheduling heuristic, lower bound, ILP compiler technique.

Lower bounds on precedence-constrained scheduling for parallel processors

Information Processing Letters

2002

Prematerialization

Hank

Gross

2006

Modern compiler transformations that eliminate redundant computations or reorder instructions, such as partial redundancy elimination and instruction scheduling, are very effective in improving application performance but tend to create longer and potentially more complex live ranges. Typically, the task of dealing with the increased register pressure is left to the register allocator. To avoid introducing spill code, which can reduce or completely eliminate the benefit of earlier optimizations, researchers have developed techniques such as live range splitting and rematerialization. This paper describes prematerialization (PM), a novel method for reducing register pressure for VLIW architectures with nop instructions. PM and rematerialization both select "never killed" live ranges and break them up by introducing one or more definitions close to the uses. However, while rematerialization is applied to live ranges selected for spilling during register allocation, PM relies on the availability of nop instructions and occurs prior to register allocation. PM simplifies register allocation by creating live ranges that are easier to color and less likely to spill. We have implemented prematerialization in HP-UX production compilers for the Intel® Itanium® architecture. Performance evaluation indicates that the proposed technique is effective in reducing register pressure inherent in highly optimized code.

Untitled

Abraham

2002