Ashok Sudarsanam scite author profile

We address the problem of code generation for DSP systems on a chip. In such systems, the amount of silicon devoted to program ROM is limited, so application software must be sufficiently dense. Additionally, the software must be written so as to meet various high-performance constraints, which may include hard real-time constraints. Unfortunately, current compiler technology is unable to generate high-quality code for DSPs, whose architectures are highly irregular. Thus, designers often resort to programming application software in assembly, a time-consuming task. In this paper, we focus on providing support for one architectural feature of DSPs that makes code generation difficult, namely multiple data memory banks. This feature increases memory bandwidth by permitting multiple data memory accesses to occur in parallel when the referenced variables belong to different data memory banks and the registers involved conform to a strict set of conditions. We present an algorithm that attempts to maximize the benefit of this architectural feature. While previous approaches have decoupled the phases of register allocation and memory bank assignment, thereby compromising code quality, our algorithm performs these two phases simultaneously. Experimental results demonstrate that our algorithm not only generates high-quality compiled code, but also improves the quality of completely-referenced code.
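
To make the bank constraint in the abstract above concrete, here is a minimal sketch of the bank-assignment half of the problem only. It is not the paper's algorithm: it ignores the register conditions and the simultaneous register allocation the abstract describes, and it substitutes a simple greedy heuristic. The bank names "X" and "Y", the function names, and the example variables are all hypothetical.

```python
def assign_banks(variables, pairs):
    """Greedily place each variable in bank "X" or "Y".

    pairs lists (u, v) variable pairs that a schedule would like to
    access in the same cycle; that is only possible when u and v end
    up in different banks. Register constraints are ignored here.
    """
    bank = {}
    for v in variables:
        # Count already-placed partners of v in each bank.
        partners = [b if a == v else a for a, b in pairs if v in (a, b)]
        in_x = sum(1 for p in partners if bank.get(p) == "X")
        in_y = sum(1 for p in partners if bank.get(p) == "Y")
        # Put v opposite the bank holding most of its partners.
        bank[v] = "Y" if in_x > in_y else "X"
    return bank


def parallel_pairs(bank, pairs):
    """Number of requested pairs whose variables sit in different banks."""
    return sum(1 for a, b in pairs if bank[a] != bank[b])


variables = ["a", "b", "c", "d"]
pairs = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]
bank = assign_banks(variables, pairs)
print(bank, parallel_pairs(bank, pairs))
```

A greedy pass like this can miss good assignments on denser pair graphs, and it says nothing about which registers the operands occupy, which is the coupling the abstract argues must be handled in the same phase.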

Analysis and evaluation of address arithmetic capabilities in custom DSP architectures

Sudarsanam

Liao

Devadas

1997

Many application-specific architectures provide indirect addressing modes with auto-increment/decrement arithmetic. Since these architectures generally do not feature an indexed addressing mode, stack-allocated variables must be accessed by allocating address registers and performing address arithmetic. Subsuming address arithmetic into auto-increment/decrement arithmetic improves both the performance and size of the generated code. Our objective in this paper is to provide a method for comprehensively analyzing the performance benefits and hardware cost due to an auto-increment/decrement feature that varies from -l to +l, and allowing access to k address registers in an address generator. We provide this method via a parameterizable optimization algorithm that operates on a procedure-wise basis. Hence, the optimization techniques in a compiler can be used not only to generate efficient or compact code, but also to help the designer of a custom DSP architecture make decisions on address arithmetic features. We present two sets of experimental results based on selected benchmark programs: (1) the values of l and k beyond which there is little or no improvement in performance, and (2) the values of l and k which result in minimum code area.
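
As a rough illustration of the trade-off this abstract describes, the sketch below counts how many address updates cannot be folded into auto-increment/decrement for a given range l. It is not the paper's optimization algorithm: it assumes a single address register (k = 1), a fixed stack layout, and a fixed access order, and the function name, layout, and access sequence are all hypothetical.

```python
def explicit_address_updates(offsets, l):
    """Count steps between consecutive accesses that exceed the
    auto-increment/decrement range [-l, +l] and therefore need a
    separate address-arithmetic instruction.

    Assumes one address register; loading it for the first access
    is not counted.
    """
    updates = 0
    for prev, cur in zip(offsets, offsets[1:]):
        if abs(cur - prev) > l:
            updates += 1
    return updates


# Hypothetical stack layout: variable name -> offset in words.
layout = {"a": 0, "b": 1, "c": 2, "d": 3}
accesses = ["a", "b", "d", "a", "c", "d"]
offsets = [layout[v] for v in accesses]

for l in (1, 2, 3):
    print(l, explicit_address_updates(offsets, l))
```

For this access sequence the count drops from 3 at l = 1 to 1 at l = 2 and 0 at l = 3, which is the kind of diminishing return the abstract's first set of experiments looks for.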

Challenges in Code Generation for Embedded Processors

Araújo

Devadas

Keutzer

et al. 2002

The effect of compiler-flag tuning on SPEC benchmark performance

Chan

Sudarsanam

Wolfe

1994

SIGARCH Comput. Archit. News

The SPEC CINT92 and CFP92 benchmark suites are application-based system benchmarks primarily intended for workstation-class system performance measurements. The SPEC CPU benchmark results are widely disseminated by system vendors and as such have become the de facto standard for comparing system performance. Recently, many observers have expressed concerns about the suitability of published SPEC benchmark results in representing application performance on typical systems. The most outspoken concern is that there is too much freedom permitted in the manipulation of compiler flags. This has resulted in revisions to the SPEC reporting procedure. This paper presents and discusses many of the issues concerning the tuning of benchmarks through manipulation of compiler flags. We attempt to quantify the impact of these procedures through controlled experiments. Baseline performance results, using a set of uniform, common optimizations, are compared to published data. Further experiments measure the performance of the SPEC benchmarks in two other common usage scenarios: a centralized file storage configuration, and a system using common binaries among several implementations of the same architecture. Despite the great concern over the use of compiler flags in the SPEC community, our experiments show only a modest impact on performance. The more significant performance differential shown in the other experiments draws into question the utility of current SPEC data to many users.
