For cost-sensitive or memory-constrained embedded systems, code size is at least as important as performance. Consequently, compact code generation has become a major focus of attention within the compiler community. In this paper we develop a pragmatic yet effective code size reduction technique that exploits the structural similarity of functions. It avoids code duplication by merging similar functions and inserting targeted control flow to resolve small differences. We have implemented our purely software-based, platform-independent technique in the LLVM compiler framework and evaluated it on the SPEC CPU2006 benchmarks across three target platforms: Intel x86, the ARM-based Qualcomm Krait™, and the Qualcomm Hexagon™ DSP. We demonstrate that code size for SPEC CPU2006 can be reduced by more than 550 KB on x86. This corresponds to an overall code size reduction of 4%, and up to 11.5% for individual programs. The overhead introduced by additional control flow is compensated for by the better I-cache performance of the compacted programs. We also show that identifying suitable candidates and subsequently merging functions can be implemented efficiently.
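The merging idea described in this abstract can be illustrated with a minimal source-level sketch. The paper itself works inside the LLVM compiler framework; the Python below is only an analogy, and every name in it (`sum_times`, `sum_plus`, `merged`, `func_id`) is hypothetical. Using an extra selector parameter plus thin wrappers is an assumption about one plausible way to "insert control flow to resolve small differences", not a description of the authors' implementation.

```python
# Two hypothetical functions that are structurally identical
# except for a single operation inside the loop.
def sum_times(values, k):
    total = 0
    for v in values:
        total += v * k
    return total


def sum_plus(values, k):
    total = 0
    for v in values:
        total += v + k
    return total


# Merged body: the shared structure appears once, and a selector
# (func_id, an assumed extra parameter) guards the one differing step.
def merged(func_id, values, k):
    total = 0
    for v in values:
        if func_id == 0:
            total += v * k   # path taken for the former sum_times
        else:
            total += v + k   # path taken for the former sum_plus
    return total


# Thin wrappers keep the original call interfaces intact.
def merged_times(values, k):
    return merged(0, values, k)


def merged_plus(values, k):
    return merged(1, values, k)
```

The extra branch inside the loop is the kind of control-flow overhead the abstract refers to; the saving comes from keeping only one copy of the shared loop structure.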
Just-in-time compilers are invoked during application execution and therefore need to ensure fast compilation times. Consequently, runtime compiler designers are averse to implementing compile-time-intensive optimization algorithms. Instead, they tend to select faster but less effective transformations. In this paper, we explore this trade-off for an important optimization: global register allocation. We present a graph-coloring register allocator that has been redesigned for runtime compilation. Compared to Chaitin-Briggs [7], a standard graph-coloring technique, the reformulated algorithm requires considerably less allocation time and produces allocations that are only marginally worse. Our experimental results indicate that the allocator performs better than the linear-scan and Chaitin-Briggs allocators on most benchmarks in a runtime compilation environment. By increasing allocation efficiency while preserving optimization quality, the presented algorithm increases the suitability and profitability of a graph-coloring register allocation strategy for a runtime compiler.
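For readers unfamiliar with the baseline, the following is a heavily simplified sketch of the simplify-and-select structure commonly associated with Chaitin-Briggs-style graph coloring. It is not the redesigned allocator this abstract presents, and it omits coalescing, spill-cost heuristics, and spill-code insertion. The function name `color_graph`, the dictionary-based interference graph, and the max-degree choice of spill candidate are all assumptions made for illustration.

```python
def color_graph(interference, k):
    """Assign one of k registers to each node of an interference graph.

    interference: dict mapping each node to the set of nodes it conflicts
                  with (assumed symmetric).
    Returns (assignment, spilled): a dict node -> register index, and a
    list of nodes that could not be colored.
    """
    # Work on a copy so the caller's graph is left untouched.
    graph = {n: set(adj) for n, adj in interference.items()}
    stack = []

    # Simplify phase: remove nodes one at a time and push them on a stack.
    while graph:
        # Prefer a node with fewer than k neighbours (trivially colorable).
        node = next((n for n in sorted(graph) if len(graph[n]) < k), None)
        if node is None:
            # Otherwise push a spill candidate optimistically; it may
            # still receive a color in the select phase.
            node = max(sorted(graph), key=lambda n: len(graph[n]))
        for neighbour in graph[node]:
            graph[neighbour].discard(node)
        del graph[node]
        stack.append(node)

    # Select phase: pop nodes and give each the lowest free register.
    assignment, spilled = {}, []
    while stack:
        node = stack.pop()
        used = {assignment[n] for n in interference[node] if n in assignment}
        free = [c for c in range(k) if c not in used]
        if free:
            assignment[node] = free[0]
        else:
            spilled.append(node)
    return assignment, spilled


# Hypothetical interference graph: a triangle a-b-c, plus d conflicting with a.
example = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
```

With three registers the example graph colors without spills; with two, one node of the triangle has to be spilled. In a full allocator a spill triggers code insertion and a rebuild of the graph, which is part of why this family of algorithms is costly at run time.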
Abstract. Techniques for global register allocation via graph coloring have been extensively studied and widely implemented in compiler frameworks. This paper examines a particular variant, the Callahan-Koblenz allocator, and compares it to the Chaitin-Briggs graph-coloring register allocator. Both algorithms were published in the 1990s, yet the academic literature does not contain an assessment of the Callahan-Koblenz allocator. This paper evaluates and contrasts the allocation decisions made by both algorithms. In particular, we focus on two key differences between the allocators. (1) Spill code: The Callahan-Koblenz allocator attempts to minimize the effect of spill code by using program structure to guide allocation and spill code placement. We evaluate the impact of this strategy on allocated code. (2) Copy elimination: Effective register-to-register copy removal is important for producing good code. The allocators use different techniques to eliminate these copies. We compare the mechanisms and provide insights into the relative performance of the contrasting techniques. The Callahan-Koblenz allocator may also insert extra branches as part of the allocation process, and we measure the performance overhead due to these branches.