Nikolai Kim scite author profile

Nikolai Kim

2Publications

3Citation Statements Received

41Citation Statements Given

How they've been cited

How they cite others

Affiliations

TU Wien

Publications

Order By: Most citations

Integrated modulo scheduling and cluster assignment for TI TMS320C64x+ architecture

Kim

Krall

2014

View full text Add to dashboard Cite

For the exploitation of the available parallelism clustered Very Long Instruction Word (VLIW) processors rely on highly optimizing compilers. Aiming this parallelism, many advanced compiler concepts have been developed and proposed in the past. Many of them concentrate on loops only as most of the execution time is usually spent executing repeating patterns of code. Software pipelining techniques, such as modulo scheduling, try to speed up the execution of loops by simultaneous initiation of multiple iterations, thus additionally exploiting parallelism across loop iteration boundaries. This increases processor utilization at the cost of higher complexity which is especially true for architectures featuring multiple clusters and distributed register files. Additional scheduling constraints need to be considered in order to produce valid schedules. Targeting TI's TMS320C64x+ clustered VLIW architecture, we describe a code generation approach that adapts an iterative modulo scheduling scheme, and also propose two heuristics for cluster assignment, all together implemented within the popular LLVM compiler framework. We cover implementation of developed algorithms, present evaluation results for a selection of benchmarks popular for embedded system development and discuss gained insights on the topics of integrated modulo scheduling and cluster assignment in this paper.

show abstract

IR-level versus machine-level if-conversion for predicated architectures

Jordan

Kim

Krall

2013

View full text Add to dashboard Cite

If-conversion is a simple yet powerful optimization that converts control dependences into data dependences. It allows elimination of branches and increases available instruction level parallelism and thus overall performance. If-conversion can either be applied alone or in combination with other techniques that increase the size of scheduling regions. The presence of hardware support for predicated execution allows if-conversion to be broadly applied in a given program. This makes it necessary to guide the optimization using heuristic estimates regarding its potential benefit. Similar to other transformations in an optimizing compiler, if-conversion inherently su↵ers from phase ordering issues. Driven by these facts, we developed two algorithms for if-conversion targeting the TI TMS320C64x+ architecture within the LLVM framework. Each implementation targets a di↵erent level of code abstraction. While one targets the intermediate representation, the other addresses machine-level code. Both make use of an adapted set of estimation heuristics and prove to be successful in general, but each one exhibits di↵erent strengths and weaknesses. High-level if-conversion, applied before other control flow transformations, has more freedom to operate. But in contrast to its machine-level counterpart, which is more restricted, its estimations of runtime are less accurate. Our results from experimental evaluation show a mean speedup close to 14% for both algorithms on a set of programs from the MiBench and DSPstone benchmark suites. We give a comparison of the implemented optimizations and discuss gained insights on the topics of ifconversion, phase ordering issues and profitability analysis.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Nikolai Kim

Integrated modulo scheduling and cluster assignment for TI TMS320C64x+ architecture

IR-level versus machine-level if-conversion for predicated architectures

Contact Info

Product

Resources

About