Lorenzo Chelini scite author profile


Near-memory computing: Past, present, and future

Singh

Chelini

Corda

et al. 2019

Microprocessors and Microsystems


The conventional approach of moving data to the CPU for computation has become a significant performance bottleneck for emerging scale-out data-intensive applications due to their limited data reuse. At the same time, the advancement in 3D integration technologies has made the decade-old concept of coupling compute units close to the memory, called near-memory computing (NMC), more viable. Processing right at the "home" of data can significantly diminish the data movement problem of data-intensive applications. In this paper, we survey the prior art on NMC across various dimensions (architecture, applications, tools, etc.) and identify the key challenges and open issues with future research directions. We also provide a glimpse of our approach to near-memory computing that includes i) NMC-specific, microarchitecture-independent application characterization, ii) a compiler framework to offload the NMC kernels on our target NMC platform, and iii) an analytical model to evaluate the potential of NMC.

OCC: An Automated End-to-End Machine Learning Optimizing Compiler for Computing-In-Memory

Siemieniuk

Chelini

Khan

et al. 2022

IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst.


Coherently Attached Programmable Near-Memory Acceleration Platform and its application to Stencil Processing

Lunteren

Luijten

Diamantopoulos

et al. 2019


Declarative Loop Tactics for Domain-specific Optimization

Chelini

Zinenko

Grosser

et al. 2019

ACM Trans. Archit. Code Optim.


Increasingly complex hardware makes the design of effective compilers difficult. To mitigate this problem, we introduce Declarative Loop Tactics, a novel framework of composable program transformations based on an internal tree-like program representation of a polyhedral compiler. The framework is based on a declarative C++ API built around easy-to-program matchers and builders, which provide the foundation to develop loop optimization strategies. Using our matchers and builders, we express computational patterns and core building blocks, such as loop tiling, fusion, and data-layout transformations, and compose them into algorithm-specific optimizations. Declarative Loop Tactics (Loop Tactics for short) can be applied to many domains. For two of them, stencils and linear algebra, we show how developers can express sophisticated domain-specific optimizations as a set of composable transformations or calls to optimized libraries. By allowing developers to add highly customized optimizations for a given computational pattern, we expect our approach to reduce the need for DSLs and to extend the range of optimizations that can be performed by a current general-purpose compiler.

CCS Concepts: Software and its engineering → Compilers
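The abstract names loop tiling as one of the core building blocks. As a rough illustration of what that transformation does to a loop nest, here is a minimal sketch in plain C++. It does not use the Loop Tactics API; the function names (`matmulNaive`, `matmulTiled`, `tiledMatchesNaive`) and the `tile` parameter are hypothetical and chosen only for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<double>;  // row-major n x n storage

// Reference triple loop: C += A * B.
void matmulNaive(const Matrix &A, const Matrix &B, Matrix &C, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    for (std::size_t j = 0; j < n; ++j)
      for (std::size_t k = 0; k < n; ++k)
        C[i * n + j] += A[i * n + k] * B[k * n + j];
}

// Same computation after tiling all three loops.
// The outer loops walk over tiles, the inner loops stay inside one tile,
// and std::min handles sizes that are not a multiple of the tile size.
void matmulTiled(const Matrix &A, const Matrix &B, Matrix &C, std::size_t n,
                 std::size_t tile) {
  assert(tile > 0 && "tile size must be positive");
  for (std::size_t ii = 0; ii < n; ii += tile)
    for (std::size_t jj = 0; jj < n; jj += tile)
      for (std::size_t kk = 0; kk < n; kk += tile)
        for (std::size_t i = ii; i < std::min(ii + tile, n); ++i)
          for (std::size_t j = jj; j < std::min(jj + tile, n); ++j)
            for (std::size_t k = kk; k < std::min(kk + tile, n); ++k)
              C[i * n + j] += A[i * n + k] * B[k * n + j];
}

// Hypothetical helper: runs both versions on the same small-integer inputs
// and reports whether the results are identical.
bool tiledMatchesNaive(std::size_t n, std::size_t tile) {
  Matrix A(n * n), B(n * n), C1(n * n, 0.0), C2(n * n, 0.0);
  for (std::size_t x = 0; x < n * n; ++x) {
    A[x] = static_cast<double>(x % 7);
    B[x] = static_cast<double>((x * 3) % 5);
  }
  matmulNaive(A, B, C1, n);
  matmulTiled(A, B, C2, n, tile);
  return C1 == C2;
}
```

Tiling only reorders the iterations, so both functions compute the same result; the expected benefit is better cache reuse within each tile. A framework of matchers and builders, as the abstract describes it, would apply such a rewrite inside the compiler rather than by hand.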


Polygeist: Raising C to Polyhedral MLIR

Moses

Chelini

Zhao

et al. 2021


Progressive Raising in Multi-level IR

Chelini

Drebes

Zinenko

et al. 2021


Multi-level intermediate representations (IR) show great promise for lowering the design costs for domain-specific compilers by providing a reusable, extensible, and non-opinionated framework for expressing domain-specific and high-level abstractions directly in the IR. But, while such frameworks support the progressive lowering of high-level representations to low-level IR, they do not raise in the opposite direction. Thus, the entry point into the compilation pipeline defines the highest level of abstraction for all subsequent transformations, limiting the set of applicable optimizations, in particular for general-purpose languages that are not semantically rich enough to model the required abstractions. We propose Progressive Raising, a complementary approach to the progressive lowering in multi-level IRs that raises from lower to higher-level abstractions to leverage domain-specific transformations for low-level representations. We further introduce Multi-Level Tactics, our declarative approach for progressive raising, implemented on top of the MLIR framework, and demonstrate the progressive raising from affine loop nests specified in a general-purpose language to high-level linear algebra operations. Our raising paths leverage subsequent high-level domain-specific transformations with significant performance improvements.

Index Terms: MLIR, progressive raising, multi-level intermediate representation

    /* instantiate the context */
    auto _i = m_Placeholder(), _j = m_Placeholder();
    auto _A = m_ArrayPlaceholder();
    auto matcher = m_Op(_A({2 * _i+1, _j+5}));

Listing 6: Declarative access pattern matcher.

    For(For(For(For(access_callback()))));
    auto access_callback = [&a](Body loop) {
      {
        AccessPatternContext pctx(/* MLIR ctx */);
        auto _a = m_Placeholder();
        auto _b = m_Placeholder();
        auto _c = m_Placeholder();
        auto _d = m_Placeholder();
        auto _C = m_ArrayPlaceholder();
        auto _A = m_ArrayPlaceholder();
        auto _B = m_ArrayPlaceholder();
        auto var0 = m_Op(_C({_a, _b, _c}));
        /* check the store is the last instruction in the block */
        auto var1 = m_Op(_C({_a, _b, _c}));
        auto var2 = m_Op(_A({_a, _c, _d}));
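The listing excerpts above rely on placeholders that stand in for induction variables and arrays. To show the general idea of placeholder-based access matching, here is a toy sketch in standalone C++. It is not the Multi-Level Tactics or MLIR API: every type and function below (`Access`, `AccessPattern`, `Bindings`, `matchAccess`, `isMatmul`) is hypothetical, and the rule that distinct index placeholders must bind to distinct variables is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// A concrete access such as C[i][j]: array name plus one variable per dimension.
struct Access {
  std::string array;
  std::vector<std::string> indices;
};

// A pattern uses placeholder names (e.g. "_i") instead of concrete names.
struct AccessPattern {
  std::string arrayPlaceholder;
  std::vector<std::string> indexPlaceholders;
};

// Records which concrete name each placeholder stands for.
class Bindings {
public:
  // Returns false if the placeholder is already bound to a different name,
  // or (when injective) if the name is already used by another placeholder.
  bool bind(const std::string &placeholder, const std::string &name,
            bool injective) {
    auto it = forward_.find(placeholder);
    if (it != forward_.end())
      return it->second == name;
    if (injective) {
      if (taken_.count(name))
        return false;
      taken_[name] = placeholder;
    }
    forward_[placeholder] = name;
    return true;
  }

private:
  std::map<std::string, std::string> forward_;  // placeholder -> name
  std::map<std::string, std::string> taken_;    // index name -> placeholder
};

// Matches one access against one pattern, extending the shared bindings.
bool matchAccess(const AccessPattern &p, const Access &a, Bindings &b) {
  if (p.indexPlaceholders.size() != a.indices.size())
    return false;
  if (!b.bind(p.arrayPlaceholder, a.array, /*injective=*/false))
    return false;
  for (std::size_t d = 0; d < a.indices.size(); ++d)
    if (!b.bind(p.indexPlaceholders[d], a.indices[d], /*injective=*/true))
      return false;
  return true;
}

// A statement of the form: store += lhs * rhs.
struct MulAccumulate {
  Access store, lhs, rhs;
};

// Recognizes the matrix-multiplication pattern C[i][j] += A[i][k] * B[k][j].
bool isMatmul(const MulAccumulate &s) {
  Bindings b;
  return matchAccess({"_C", {"_i", "_j"}}, s.store, b) &&
         matchAccess({"_A", {"_i", "_k"}}, s.lhs, b) &&
         matchAccess({"_B", {"_k", "_j"}}, s.rhs, b);
}
```

Because the same `Bindings` object is shared across the three accesses, `_i`, `_j`, and `_k` must refer to the same variables everywhere they appear; that consistency is what separates a matrix multiplication from, say, a product with a transposed operand. Once such a pattern is recognized on a loop nest, a raising pass could replace the nest with a single high-level linear algebra operation, which is the direction the abstract describes.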


Automatic Generation of Multi-Objective Polyhedral Compiler Transformations

Chelini

Gysi

Grosser

et al. 2020


To this day, polyhedral optimizing compilers use either extremely rigid (but accurate) cost models, one-size-fits-all general-purpose heuristics, or auto-tuning strategies to traverse and evaluate large optimization spaces. In this paper, we introduce an adaptive and automatic scheduler that can generate novel loop transformation sequences (or recipes) capable of delivering strong performance on a variety of architectures without relying on auto-tuning or on predetermined transformation strategies. We evaluate our approach using the Polybench/C benchmark suite against two modern state-of-the-art optimizers on three different architectures: an AMD ThreadRipper, an Intel Xeon Phi, and an Intel Xeon Platinum. Our results provide evidence that a set of high-level objectives backed by an automatic adaptive scheduler (i.e., not hard-wired) is capable of achieving competitive performance while evaluating only a handful of tuned variants.



Lorenzo Chelini

A Review of Near-Memory Computing Architectures: Opportunities and Challenges

