Jiajia Li scite author profile

Sparse matricized tensor times Khatri-Rao product (MTTKRP) is one of the most computationally expensive kernels in sparse tensor computations. This work focuses on optimizing the MTTKRP operation on GPUs, addressing both performance and storage requirements. We begin by identifying the performance bottlenecks in directly extending the state-of-the-art CSF (compressed sparse fiber) format from CPUs to GPUs. A significant challenge with GPUs compared to multicore CPUs is that of utilizing the much greater degree of parallelism in a load-balanced fashion for irregular computations like sparse MTTKRP. To address this issue, we develop a new storage-efficient representation for tensors that enables high-performance, load-balanced execution of MTTKRP on GPUs. A GPU implementation of sparse MTTKRP using the new sparse tensor representation is shown to outperform all currently known parallel sparse CPU and GPU MTTKRP implementations.
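For readers unfamiliar with the kernel named in this abstract, the following is a minimal sequential sketch of mode-0 MTTKRP for a 3-way sparse tensor. It assumes a plain coordinate (COO) list rather than CSF or the representation the paper proposes, and the names `mttkrp_mode0`, `coords`, `vals`, `B`, and `C` are illustrative, not taken from the paper.

```python
def mttkrp_mode0(coords, vals, B, C, num_rows):
    """Mode-0 MTTKRP: M[i][r] = sum over nonzeros X[i,j,k] * B[j][r] * C[k][r].

    coords: list of (i, j, k) index triples of the nonzeros (COO form, assumed here)
    vals:   nonzero values, aligned with coords
    B, C:   dense factor matrices for modes 1 and 2, as lists of rows
    """
    rank = len(B[0])
    M = [[0.0] * rank for _ in range(num_rows)]
    for (i, j, k), v in zip(coords, vals):
        for r in range(rank):
            # Each nonzero contributes one scaled row product to output row i.
            M[i][r] += v * B[j][r] * C[k][r]
    return M


# Tiny 2 x 2 x 2 tensor with three nonzeros and rank-2 factors.
coords = [(0, 0, 0), (0, 1, 1), (1, 1, 0)]
vals = [1.0, 2.0, 3.0]
B = [[1.0, 2.0], [3.0, 4.0]]
C = [[5.0, 6.0], [7.0, 8.0]]
M = mttkrp_mode0(coords, vals, B, C, num_rows=2)
print(M)  # [[47.0, 76.0], [45.0, 72.0]]
```

In this toy input, output row 0 receives two nonzero contributions and row 1 only one. On real tensors such per-row (or per-fiber) differences can be very large, which is the kind of irregularity that makes load-balanced parallel execution difficult.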

Optimizing Sparse Tensor Times Matrix on Multi-core and Many-Core Architectures

Li

Ma

Yan

et al. 2016

A pattern based algorithmic autotuner for graph processing on GPUs

Meng

Tan

et al. 2019

Bridging the gap between deep learning and sparse matrix format selection

Zhao Yue

Liao Chunhua

et al. 2018

SIGPLAN Not.

This work presents a systematic exploration of the promise and special challenges of deep learning for sparse matrix format selection, the problem of determining the best storage format for a matrix to maximize the performance of Sparse Matrix Vector Multiplication (SpMV). It describes how to effectively bridge the gap between deep learning and the special needs of the pillar HPC problem through a set of techniques on matrix representations, deep learning structure, and cross-architecture model migrations. The new solution cuts format selection errors by two-thirds, and improves SpMV performance by 1.73X on average over the state of the art.
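To make "format selection" concrete, here is a small sketch of SpMV over the same matrix stored in two common formats, COO and CSR. It only illustrates that the formats are interchangeable in result while differing in memory layout and loop structure. The function names are illustrative assumptions, and nothing here reproduces the paper's deep learning selector.

```python
def spmv_coo(rows, cols, vals, x, n_rows):
    """y = A @ x with A in coordinate (COO) form: one (row, col, value) per nonzero."""
    y = [0.0] * n_rows
    for i, j, v in zip(rows, cols, vals):
        y[i] += v * x[j]
    return y


def spmv_csr(row_ptr, col_idx, vals, x):
    """y = A @ x with A in compressed sparse row (CSR) form."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0.0
        # Nonzeros of row i occupy positions row_ptr[i] .. row_ptr[i + 1] - 1.
        for p in range(row_ptr[i], row_ptr[i + 1]):
            acc += vals[p] * x[col_idx[p]]
        y.append(acc)
    return y


# A = [[1, 0, 2],
#      [0, 3, 0],
#      [4, 0, 5]]
vals = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0]
y_coo = spmv_coo([0, 0, 1, 2, 2], [0, 2, 1, 0, 2], vals, x, n_rows=3)
y_csr = spmv_csr([0, 2, 3, 5], [0, 2, 1, 0, 2], vals, x)
print(y_coo, y_csr)  # [7.0, 6.0, 19.0] [7.0, 6.0, 19.0]
```

Both formats compute the same product. Which one runs faster depends on the matrix's sparsity pattern and the target hardware, and that choice is the decision a format selector has to predict.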

An Initial Characterization of the Emu Chick

Hein

Conte

Young

et al. 2018

Jiajia Li

HiCOO: Hierarchical Storage of Sparse Tensors

Model-Driven Sparse CP Decomposition for Higher-Order Tensors

An efficient mixed-mode representation of sparse tensors

Load-Balanced Sparse MTTKRP on GPUs

Optimizing Sparse Tensor Times Matrix on Multi-core and Many-Core Architectures

A pattern based algorithmic autotuner for graph processing on GPUs

Bridging the gap between deep learning and sparse matrix format selection

An Initial Characterization of the Emu Chick
