2021
DOI: 10.48550/arxiv.2105.03215
Preprint

Bring Your Own Codegen to Deep Learning Compiler

Zhi Chen,
Cody Hao Yu,
Trevor Morris
et al.

Abstract: Deep neural networks (DNNs) have been ubiquitously applied in many applications, and accelerators have emerged as an enabler for fast and efficient inference in these applications. However, to achieve high model coverage with high performance, each accelerator vendor has to develop a full compiler stack to ingest, optimize, and execute the DNNs. This poses significant challenges in the development and maintenance of the software stack. In addition, the vendors have to continuously update their …

Cited by 3 publications (5 citation statements)
References 17 publications
“…Genesis [26] is a DL compiler that integrates graph partitioning functionalities into TVM. Genesis has a similar structure to NEST-C.…”
Section: Compilers (mentioning)
confidence: 99%
“…Once flexible matching completes, the extracted rewritten program is translated back to Relay where accelerator instructions are specially annotated. In our prototype, we use TVM's Bring Your Own Codegen (BYOC) interface to implement the generation of those accelerator instructions [16]. BYOC allows for invoking the target interface of a custom execution mechanism (e.g., an accelerator's MMIO loads/stores) by having TVM's runtime defer execution to a user-specified runtime when it reaches an annotated portion of the program.…”
Section: Prototype Implementation (mentioning)
confidence: 99%
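The annotate-and-defer mechanism described in that citation can be sketched with TVM's public Relay passes. The example below is a minimal, illustrative flow, not code from the cited papers: it uses the in-tree "dnnl" contrib backend as a stand-in for a vendor codegen (building it requires a TVM installation compiled with that backend enabled), and the toy conv2d+relu model is chosen only for brevity.

```python
# Minimal, illustrative BYOC flow using TVM's public Relay passes.
# "dnnl" is an in-tree example backend standing in for a vendor codegen.
import tvm
from tvm import relay

# Toy model: conv2d -> relu.
data = relay.var("data", shape=(1, 3, 224, 224), dtype="float32")
weight = relay.var("weight", shape=(16, 3, 3, 3), dtype="float32")
out = relay.nn.relu(relay.nn.conv2d(data, weight, padding=(1, 1)))
mod = tvm.IRModule.from_expr(relay.Function([data, weight], out))

# Annotate ops the external backend supports, merge adjacent supported
# regions, and split them into separate functions carrying a "Compiler"
# attribute.
mod = relay.transform.AnnotateTarget("dnnl")(mod)
mod = relay.transform.MergeCompilerRegions()(mod)
mod = relay.transform.PartitionGraph()(mod)

# At build time TVM hands the partitioned functions to the matching
# codegen; at run time the graph executor defers those calls to the
# backend's runtime module, as the citation above describes.
lib = relay.build(mod, target="llvm")
```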
“…In principle, many DSLs allow for supporting custom accelerators via bespoke translations from DSL operators to specific accelerator APIs, e.g., as in the original TVM [14] support for VTA [50]. TVM's BYOC [16] interface eases incorporating custom accelerators by performing syntactic pattern matching to offload computations via user-provided code generators. However, BYOC leaves all matters of code generation, e.g., MMIO invocations, to the user, while D2A provides more structure to code generation via the ILA.…”
Section: Pattern Matching Accelerator Calls (mentioning)
confidence: 99%
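The "syntactic pattern matching" this citation refers to is driven by a user-provided pattern table. The sketch below uses TVM's dataflow-pattern API to declare a conv2d+relu sequence as offloadable; the backend name "my_accel" and the composite name are hypothetical placeholders, and the pipeline that consumes the table is noted in the trailing comment.

```python
# Hedged sketch of BYOC's pattern-table mechanism with a hypothetical
# backend called "my_accel". The dataflow-pattern API itself is TVM's.
from tvm import relay
from tvm.relay.dataflow_pattern import is_op, wildcard
from tvm.relay.op.contrib.register import register_pattern_table


def make_conv_relu_pattern():
    # Match conv2d followed by relu so the pair is offloaded as one unit.
    conv = is_op("nn.conv2d")(wildcard(), wildcard())
    return is_op("nn.relu")(conv)


@register_pattern_table("my_accel")
def my_accel_patterns():
    # (composite name, pattern) pairs consumed by MergeComposite.
    return [("my_accel.conv2d_relu", make_conv_relu_pattern())]


# In the partitioning pipeline, matched subgraphs become composite
# functions that the "my_accel" codegen later lowers to accelerator calls:
#   mod = relay.transform.MergeComposite(my_accel_patterns())(mod)
#   mod = relay.transform.AnnotateTarget("my_accel")(mod)
#   mod = relay.transform.PartitionGraph()(mod)
```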
“…One naïve solution is to develop a full compiler stack from scratch for each hardware, but this does not scale. Bolt addresses this challenge by employing a BYOC (Bring Your Own Codegen) (Chen et al, 2021) approach. It enables us to reuse the existing compiler stacks (e.g., TVM) as much as possible and focus only on the optimization and code generation using templated device libraries.…”
Section: Challenges In Code Generation (mentioning)
confidence: 99%
“…Traditional BYOC systems (Chen et al, 2021) cannot target code generation in templated format; they treat such libraries as external functions at runtime. In contrast, Bolt produces low-level tensor implementations in the CUTLASS convention by instantiating the templates with the best parameters identified by the profiler.…”
Section: Templated Code Generation (mentioning)
confidence: 99%
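As a rough illustration of what "instantiating the templates with the best parameters" can mean in practice (this is not Bolt's actual implementation), a codegen may fill a C++ source template, here a CUTLASS-style GEMM declaration, with tile sizes chosen by a profiler rather than calling a fixed external function. All parameter names and values below are hypothetical.

```python
# Hedged illustration (not Bolt's code): emitting a templated CUTLASS-style
# GEMM declaration by substituting profiler-chosen parameters into a C++
# source template.
CUTLASS_GEMM_TEMPLATE = """
using Gemm = cutlass::gemm::device::Gemm<
    {dtype}, cutlass::layout::RowMajor,
    {dtype}, cutlass::layout::ColumnMajor,
    {dtype}, cutlass::layout::RowMajor,
    {accum_dtype},
    cutlass::arch::OpClassTensorOp, cutlass::arch::Sm80,
    cutlass::gemm::GemmShape<{tb_m}, {tb_n}, {tb_k}>,      // threadblock tile
    cutlass::gemm::GemmShape<{warp_m}, {warp_n}, {warp_k}>  // warp tile
>;
"""


def emit_gemm(best_params: dict) -> str:
    """Instantiate the template with the tile sizes the profiler picked."""
    return CUTLASS_GEMM_TEMPLATE.format(**best_params)


# Example: parameters a profiler might select for one GEMM workload.
print(emit_gemm({
    "dtype": "cutlass::half_t", "accum_dtype": "float",
    "tb_m": 128, "tb_n": 128, "tb_k": 32,
    "warp_m": 64, "warp_n": 64, "warp_k": 32,
}))
```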