Abstract: Recent work on compilation for DSP processors deals with optimizing access to the local variables of functions. The common approach is to use one or more address registers as pointers into the function's stack frame and to modify them with post-modify addressing modes (which are sometimes the only addressing modes available). In addition to previous work, we present an algorithm that assigns frame pointer values over a whole procedure. Our algorithm also handles basic blocks that have no accesses to local variables. The algorith…
“…
(a) … a[i+2] = avg * 3; } if (avg < error) avg -= a[i+1] - error/2; else avg -= a[i+2] - error; }
(b) … avg += *p++ << 2; *p-- = avg * 3; } if (avg < error) avg -= *p++ - error/2; else { p += 1; avg -= *p - error; } }
First of all, assume for the rest of this paper that the array data type is a memory word (a typical characteristic of embedded programs). Moreover, assume that each array reference is atomic (i.e.…”
Section: Basic Concepts (mentioning)
confidence: 99%
“…This technique is called Live Range Growth (LRG). Merge operations for similar problems have also been studied in [14,9]. The cost of merging two ranges R and S (cost_1(R, S)) is the total number of cycles of the update instructions required by the merge.…”
Section: Basic Concepts (mentioning)
confidence: 99%
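As a rough illustration of the merge cost quoted above, the following sketch counts the update instructions needed when one address register serves the references of two merged ranges. It assumes that each reference is reduced to its constant index offset (for example, 1 for a[i+1]), that references are visited in a fixed straight-line order, and that a step of at most 1 is covered by auto-increment or auto-decrement while any larger jump costs one single-cycle update instruction. Loop back edges and control-flow joins are ignored. The function names, the (position, offset) encoding, and the cost model are illustrative assumptions, not the cited paper's definition.

```python
def update_cost(offsets, max_step=1):
    """Count explicit update instructions for one address register.

    offsets: index offsets of the references, in access order.
    A move of at most max_step is assumed to be covered by
    auto-increment/decrement; anything larger costs one update.
    """
    return sum(1 for a, b in zip(offsets, offsets[1:]) if abs(b - a) > max_step)


def merge_cost(range_r, range_s, max_step=1):
    """Cost of serving both ranges with a single register.

    Each range is a list of (position, offset) pairs, where position
    is the order of the reference in the program (hypothetical encoding).
    """
    merged = sorted(range_r + range_s)  # interleave by program position
    return update_cost([off for _, off in merged], max_step)


# Hypothetical example: R covers a[i], a[i+1]; S covers a[i+3], a[i+2].
R = [(0, 0), (2, 1)]
S = [(1, 3), (3, 2)]
print(merge_cost(R, S))
```

In this example each range is free on its own, but the merged access order 0, 3, 1, 2 needs two explicit updates, which is the price of saving one register.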
“…We call this algorithm the LRO Algorithm, and show its pseudo-code in Alg.
(4) for all but the last vertex v_p ∈ DG_φ, according to LRO, do
(5)   let (v_p, v_q) be the last edge incident to v_p
(6)   for j ← 1 to m do
(7)     min ← +∞
(8)     for i ← 1 to m do
(9)       if…”
Abstract. The increasing demand for wireless devices running mobile applications has renewed interest in research on high-performance, low-power processors that can be programmed using very compact code. One way to achieve this goal is to design specialized processors with short instruction formats and shallow pipelines. Because it enables such architectural features, indirect addressing is the most used addressing mode in embedded programs. This paper analyzes the problem of allocating address registers to array references in loops using auto-increment addressing mode. It builds on previous work, which is based on a heuristic that merges address register live ranges. We prove, for the first time, that the merge operation is NP-hard in general, and show the existence of an optimal linear-time algorithm, based on dynamic programming, for a special case of the problem.
“…The allocation of local variables to the stack-frame, using auto-increment (decrement) mode, has been studied in Refs. [5,15,22-24,28].…”
Section: Previous Work (mentioning)
confidence: 99%
“…identify the array references in L, partitioning them
(3)   in P_1, …, P_K;
(4) for each P_j, 1 ≤ j ≤ K do
(5)   Compute_Minimum_Costs(P_j);
(6) Optimal_AR_Distribution({P_1, …, P_K}, C);
(7)
(8) procedure Compute_Minimum_Costs(P_j)
(9)   fill in C_0j with the cost estimated if no address
(10)    register is allocated to P_j;
(11)  C_ij ← +∞, 1 ≤ i ≤ R;
(12)  for each combination of partitioning the references
(13)    in P_j in live ranges LR_1, …, LR_i, 1 ≤ i ≤ R do
(14)    total_cost ← 0;
(15)    for each LR_k, 1 ≤ k ≤ i, do
(16)      build DG_φ for LR_k;
(17)      if DG_φ is a tree then
(18)        cost ← LRO_cost(LR_k);
(19)      else
(20)        cost ← Brute_Force_cost(LR_k);…”
Efficient address register allocation has been shown to be a central problem in code generation for processors with restricted addressing modes. This paper extends previous work on Global Array Reference Allocation (GARA), the problem of allocating address registers to array references in loops. It describes two heuristics for the problem and presents experimental data to support them. In addition, it proposes an approach that solves GARA optimally and which, although exponential in running time, is useful for measuring the efficiency of other methods. Experimental results, using the MediaBench benchmark and profiling information, reveal that the proposed heuristics can solve the majority of the benchmark loops near optimality in polynomial time. A substantial execution time speedup is reported for the benchmark programs after being compiled with the original and the optimized versions of GCC.
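The quoted pseudo-code fills a cost table C and then calls Optimal_AR_Distribution, whose body is not shown in the excerpt. One plausible way to distribute R address registers over K partitions from such a table is a small dynamic program. The sketch below is an assumption about how that step could be done, not the paper's procedure; `distribute_registers` and the layout `cost[j][i]` (cost of partition j when it receives i registers) are hypothetical names chosen for illustration.

```python
def distribute_registers(cost, num_regs):
    """Pick how many registers each partition gets, minimizing total cost.

    cost[j][i] is the (assumed precomputed) minimum cost of partition j
    when it receives i address registers, for 0 <= i <= num_regs.
    Returns (total_cost, allocation) with sum(allocation) <= num_regs.
    """
    # best[r] = (cost, allocation) over the partitions processed so far,
    # using at most r registers in total.
    best = [(0, [])] * (num_regs + 1)
    for row in cost:
        new_best = []
        for r in range(num_regs + 1):
            candidate = None
            for i in range(r + 1):  # give i registers to this partition
                prev_cost, prev_alloc = best[r - i]
                total = prev_cost + row[i]
                if candidate is None or total < candidate[0]:
                    candidate = (total, prev_alloc + [i])
            new_best.append(candidate)
        best = new_best
    return best[num_regs]


# Hypothetical cost table for K = 2 partitions and R = 2 registers.
example_cost = [
    [10, 4, 1],  # partition 1 with 0, 1, 2 registers
    [8, 2, 2],   # partition 2 with 0, 1, 2 registers
]
print(distribute_registers(example_cost, 2))
```

For this table, giving one register to each partition (total cost 6) beats giving both registers to either one. The sketch runs in O(K · R²) time once the table is filled, so the exponential part of the quoted approach would remain in computing the table entries.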
Abstract. Offset assignment is a highly effective DSP address code optimization technique that has been implemented in a number of ANSI C compilers. In this paper we concentrate on a special class of offset assignment problems called "simple offset assignment" (SOA). A number of SOA algorithms have been proposed recently, but experimental results and direct comparisons are still sparse. This makes the practical selection of a suitable SOA algorithm for implementation in a compiler very difficult. This paper aims at closing this gap by providing a comprehensive benchmark suite and empirical evaluation based on real-life application programs. Our results for the first time permit a detailed assessment of all major SOA algorithms. In addition, we propose a new and superior combination of SOA heuristics.
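The quantity that SOA heuristics try to minimize can be sketched as follows. Given the order in which local variables are accessed and a layout of those variables in memory, every transition between variables that are not at adjacent addresses needs one explicit address-register instruction. The sketch assumes a single address register with free post-increment and post-decrement by one; the function name and the example access sequence are hypothetical.

```python
def soa_cost(access_sequence, layout):
    """Number of explicit address-register updates for a given layout.

    access_sequence: variable names in the order they are accessed.
    layout: variable names in the order they are placed in memory.
    Assumes one address register and free post-increment/decrement by 1.
    """
    address = {var: pos for pos, var in enumerate(layout)}
    return sum(
        1
        for a, b in zip(access_sequence, access_sequence[1:])
        if abs(address[b] - address[a]) > 1
    )


# Hypothetical access sequence over four local variables.
seq = ["a", "b", "c", "a", "d", "a", "c"]
print(soa_cost(seq, ["a", "b", "c", "d"]))  # declaration order
print(soa_cost(seq, ["b", "c", "a", "d"]))  # layout chosen by hand
```

Placing the frequently paired variables next to each other reduces the count from 4 to 1 in this example, which is the effect the SOA algorithms compared in the abstract aim for.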