2019
DOI: 10.48550/arxiv.1910.00078
Preprint
MIOpen: An Open Source Library For Deep Learning Primitives

Cited by 3 publications (3 citation statements)
References 0 publications
“…These APIs are backed by reference implementations that enable Flashlight to efficiently target CPUs, GPUs, and other accelerators. These include code generation and dedicated kernels for Intel, AMD, OpenCL, and CUDA devices, and leverage libraries such as cuDNN [Chetlur et al, 2014], MKL [Intel, 2020a], oneDNN [Intel, 2020b], ArrayFire [Malcolm et al, 2012], and MIOpen [Khan et al, 2019].…”
Section: Open Foundational Interfaces
confidence: 99%
“…This work aims to catch up with NVIDIA's performance [16]. MIOpen is a deep learning acceleration library implemented by AMD for Radeon GPUs [18]. At present, the library's implementation of each algorithm is not yet complete and its performance falls short of expectations, but it offers useful guidance for GPU implementations of the Winograd algorithm.…”
Section: Introduction
confidence: 99%
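The citation statement above refers to the Winograd algorithm, the minimal-filtering scheme that libraries like MIOpen use to reduce the multiplication count of small-kernel convolutions. As a hedged illustration (this is a generic NumPy sketch of the textbook F(2,3) transform, not MIOpen's actual kernel code), the smallest variant computes two outputs of a 3-tap 1D convolution with 4 multiplies instead of 6:

```python
import numpy as np

# Winograd F(2,3) transform matrices (minimal filtering for
# 2 outputs of a 3-tap filter): Y = A^T [ (G g) * (B^T d) ].
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=float)
G  = np.array([[1.0,  0.0, 0.0],
               [0.5,  0.5, 0.5],
               [0.5, -0.5, 0.5],
               [0.0,  0.0, 1.0]])
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=float)

def winograd_f23(d, g):
    """Two outputs of sliding 3-tap filter g over 4-sample tile d,
    using 4 element-wise multiplies in the transform domain."""
    return AT @ ((G @ g) * (BT @ d))

def direct_conv(d, g):
    """Reference: the same two outputs by direct computation (6 multiplies)."""
    return np.array([d[0]*g[0] + d[1]*g[1] + d[2]*g[2],
                     d[1]*g[0] + d[2]*g[1] + d[3]*g[2]])

d = np.array([1.0, 2.0, 3.0, 4.0])   # input tile
g = np.array([0.5, 1.0, -1.0])       # filter taps
print(winograd_f23(d, g))            # matches direct_conv(d, g)
```

In a real GPU library the filter transform `G @ g` is precomputed once per layer and the element-wise products are batched into large matrix multiplies, which is where the per-architecture tuning discussed in the citing paper comes in.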
“…AMD followed a similar technical route, naming their analogous component the Matrix Core (MC) in their MI100-series data-center GPUs. This algorithm-architecture co-design has proved highly successful: mainstream deep learning frameworks such as PyTorch have embraced these designs through vendor-provided high-performance libraries such as cuDNN, cuBLAS, and MIOpen [8]. Other vendors, including Google and Tesla, have also presented proprietary ASIC accelerators, TPU [9] and Dojo [10], aiming to accelerate quantized workloads by exploiting special hardware components that operate on low-precision data types.…”
Section: Introduction
confidence: 99%