2018
DOI: 10.48550/arXiv.1802.04799
Preprint

TVM: An Automated End-to-End Optimizing Compiler for Deep Learning

Abstract: There is an increasing need to bring machine learning to a wide diversity of hardware devices. Current frameworks rely on vendor-specific operator libraries and optimize for a narrow range of server-class GPUs. Deploying workloads to new platforms, such as mobile phones, embedded devices, and accelerators (e.g., FPGAs, ASICs), requires significant manual effort. We propose TVM, a compiler that exposes graph-level and operator-level optimizations to provide performance portability to deep learning workloads across diverse hardware back-ends. […]

Cited by 52 publications (68 citation statements)
References 39 publications

“…PyExZ3 [31], PySym [25], flake8 [13], and Frosted [65] analyze Python source code and employ multiple heuristics to identify code issues statically [27]. XLA [64] and TVM [10] apply compiler techniques to optimize deep learning applications. Harp [74] detects inefficiencies in Tensorflow and PyTorch applications based on computation graphs.…”
Section: Existing Tools vs. PieProf
confidence: 99%
“…MNN relies on a semi-automated search technique to generate the kernels from a pre-defined number of optimization strategies [25]. TVM takes it a step further and performs compilation and autotuning for each kernel [2]. In SparseDNN, we adopt the last approach.…”
Section: Deep Learning Inference Engines
confidence: 99%
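As a concrete illustration of the per-kernel compilation and autotuning the excerpt above describes, the sketch below tunes a single matrix-multiply workload with TVM's auto_scheduler. It is a minimal sketch of the general technique, not code from the cited papers; the workload shape, trial budget, and log-file name are illustrative assumptions.

import tvm
from tvm import te, auto_scheduler

# Minimal sketch: define one kernel as a tensor-expression workload,
# then let TVM's auto-scheduler search for a fast schedule for it.
@auto_scheduler.register_workload
def matmul(N, M, K):
    A = te.placeholder((N, K), name="A")
    B = te.placeholder((K, M), name="B")
    k = te.reduce_axis((0, K), name="k")
    C = te.compute((N, M), lambda i, j: te.sum(A[i, k] * B[k, j], axis=k), name="C")
    return [A, B, C]

target = tvm.target.Target("llvm")  # CPU back-end; shapes below are illustrative
task = auto_scheduler.SearchTask(func=matmul, args=(128, 128, 128), target=target)

log_file = "matmul_tuning.json"  # hypothetical log-file name
task.tune(auto_scheduler.TuningOptions(
    num_measure_trials=64,  # small search budget, for demonstration only
    measure_callbacks=[auto_scheduler.RecordToFile(log_file)],
))

# Compile the kernel with the best schedule found during the search.
sch, args = task.apply_best(log_file)
lib = tvm.build(sch, args, target)

The search records every measured candidate to the log file, and apply_best replays the fastest schedule found when compiling the kernel; repeating this per kernel is what the excerpt means by compilation and autotuning for each kernel.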
“…In addition, the user could generate dense kernels from deep learning compiler frameworks such as TVM or Triton to use as plugins instead of oneDNN kernels if they provide better performance [2,36]. We do not explore this option in this paper.…”
Section: Optimized Dense Kernels
confidence: 99%
“…Low precision operators rely on efficient bitserial computation. We implement our operators using TVM, the deep learning compiler [3]. Our operators are designed to provide flexibility in precision and data layout, and performance portability across different CPU architectures.…”
Section: Low Precision Operators
confidence: 99%
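To make the bit-serial idea in the excerpt above concrete: with 1-bit operands packed into machine words, a dot product reduces to popcount(a AND b) accumulated over the packed words. The tensor-expression sketch below expresses this in TVM; it is a minimal sketch assuming inputs are already bit-packed, and the shapes and names are illustrative, not the cited paper's operators.

import tvm
from tvm import te

# Illustrative sketch of a 1-bit x 1-bit bit-serial dot product.
# Assumes both operands are already bit-packed into uint32 words.
W = 32  # number of packed 32-bit words (1024 logical bits); illustrative
a = te.placeholder((W,), dtype="uint32", name="a")
b = te.placeholder((W,), dtype="uint32", name="b")
k = te.reduce_axis((0, W), name="k")

# Each AND keeps the positions where both bits are set; popcount counts
# them per word; the reduction sums the counts across all packed words.
dot = te.compute(
    (1,),
    lambda _: te.sum(tvm.tir.popcount(a[k] & b[k]), axis=k),
    name="dot",
)

s = te.create_schedule(dot.op)
f = tvm.build(s, [a, b, dot], target="llvm")

Because the whole reduction runs on word-wide bitwise instructions rather than multiplies, the same expression lowers to efficient code on any CPU target TVM supports, which is the performance-portability point the excerpt makes.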