Abstract. Instruction selection for embedded processors is a challenging problem. Embedded system architectures feature highly irregular instruction sets and complex data paths. Traditional code generation techniques have difficulty fully utilizing the features of such architectures and typically produce inefficient code. In this paper we describe an instruction selection technique that uses static single assignment graphs (SSA-graphs) as the underlying data structure for selection. Patterns defined as graph …
“…Together with instruction scheduling and register allocation, instruction selection has also been approached using integer programming [8,25,54] and constraint programming [7,23,44]. Far fewer methods exist for global instruction selection, which so far only has been approached as a partitioned Boolean quadratic problem [11,18,19] or using constraint programming [29]. Common among these techniques is that they are restricted to tree-based and DAG-based patterns, whereas our approach can operate on full-fledged pattern graphs.…”
In code generation, instruction selection chooses processor instructions to implement a program under compilation where code quality crucially depends on the choice of instructions. Using methods from combinatorial optimization, this paper proposes an expressive model that integrates global instruction selection with global code motion. The model introduces (1) handling of memory computations and function calls, (2) a method for inserting additional jump instructions where necessary, (3) a dependency-based technique to ensure correct combinations of instructions, (4) value reuse to improve code quality, and (5) an objective function that reduces compilation time and increases scalability by exploiting bounding techniques. The approach is demonstrated to be complete and practical, competitive with LLVM, and potentially optimal (w.r.t. the model) for medium-sized functions. The results show that combinatorial optimization for instruction selection is well-suited to exploit the potential of modern processors in embedded systems.
“…By backpropagating the reductions, the selection of the smaller instance can be extended to a selection of the original PBQP instance. Originally, there are four reductions [5,9,16]: RE: Independent edges have a cost matrix that can be decomposed into two vectors u and v, i.e. each matrix entry C_ij has costs u_i + v_j.…”
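The RE reduction quoted above can be illustrated with a minimal sketch. It assumes finite costs and plain Python lists. The function names `decompose_independent_edge` and `reduce_independent_edge` are hypothetical and do not come from the cited paper. The sketch also assumes that, once an edge is found to be independent, u and v are folded into the cost vectors of the two endpoint nodes so that the edge can be dropped; the quoted passage is truncated before it states this step.

```python
from typing import List, Optional, Tuple

Vector = List[float]
Matrix = List[List[float]]


def decompose_independent_edge(
    C: Matrix, tol: float = 1e-9
) -> Optional[Tuple[Vector, Vector]]:
    """Return (u, v) with C[i][j] == u[i] + v[j], or None if no such pair exists.

    Assumes all entries are finite (infinite costs are not handled here).
    """
    if not C or not C[0]:
        return None
    # Normalise with u[0] = 0, so v is the first row of C
    # and u[i] is the offset of row i relative to row 0.
    v = list(C[0])
    u = [row[0] - C[0][0] for row in C]
    # The edge is independent only if every entry matches u[i] + v[j].
    for i, row in enumerate(C):
        for j, entry in enumerate(row):
            if abs(entry - (u[i] + v[j])) > tol:
                return None
    return u, v


def reduce_independent_edge(
    cost_x: Vector, cost_y: Vector, C: Matrix
) -> Optional[Tuple[Vector, Vector]]:
    """Fold an independent edge into its endpoints' cost vectors.

    Returns the updated node cost vectors, or None if the edge is not
    independent and therefore has to stay in the graph.
    """
    decomposition = decompose_independent_edge(C)
    if decomposition is None:
        return None
    u, v = decomposition
    new_x = [a + b for a, b in zip(cost_x, u)]
    new_y = [a + b for a, b in zip(cost_y, v)]
    return new_x, new_y
```

For any pair of alternatives (i, j), the total cost `cost_x[i] + cost_y[j] + C[i][j]` before the reduction equals `new_x[i] + new_y[j]` after it, which is why the edge carries no further information once it has been folded away.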
Abstract. Recent research shows that maintaining SSA form makes it possible to split register allocation into separate phases: spilling, register assignment and copy coalescing. After spilling, register assignment can be done in polynomial time, but copy coalescing is NP-complete. In this paper we present an assignment approach with integrated copy coalescing, which maps the problem to the Partitioned Boolean Quadratic Problem (PBQP). Compared to the state-of-the-art recoloring approach, this reduces the relative number of swap and copy instructions for the SPEC CINT2000 benchmark to 99.6% and 95.2%, respectively, while taking 19% less time for assignment and coalescing.
“…This works fine for all our test programs and covers most, but not all, dependencies between rich instructions. Therefore, our current research is to use a heuristic PBQP-solver for rule selection [12,13].…”
Abstract. We present a compiler-internal program optimization that uses graph rewriting. This optimization enables the compiler to automatically use rich instructions (such as SIMD instructions) provided by modern CPUs and is transparent to the user of the compiler. New instructions can be introduced easily by specifying their behaviour in a high-level programming language. The optimization is integrated into an existing compiler and yields a high speedup.