2019
DOI: 10.1137/18m1210691

Software for Sparse Tensor Decomposition on Emerging Computing Architectures

Abstract: In this paper, we develop software for decomposing sparse tensors that is portable to and performant on a variety of multicore, manycore, and GPU computing architectures. The result is a single code whose performance matches optimized architecture-specific implementations. The key to a portable approach is to determine multiple levels of parallelism that can be mapped in different ways to different architectures, and we explain how to do this for the matricized tensor times Khatri-Rao product (MTTKRP), which is …
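The MTTKRP kernel named in the abstract is the dominant cost of CP decomposition for a sparse tensor. As a point of reference only (a minimal NumPy sketch over a COO-format tensor, not the paper's Kokkos-based implementation; all names are illustrative), the kernel and its two natural levels of parallelism look roughly like this:

```python
import numpy as np

def mttkrp_mode0_coo(coords, vals, B, C):
    """Mode-0 MTTKRP for a sparse 3-way tensor stored in COO format.

    coords: (nnz, 3) integer array of nonzero indices (i, j, k)
    vals:   (nnz,)   nonzero values
    B, C:   factor matrices for modes 1 and 2, shapes (n1, r) and (n2, r)
    """
    n0 = coords[:, 0].max() + 1
    r = B.shape[1]
    M = np.zeros((n0, r))
    # Coarse level of parallelism: distribute nonzeros (or blocks of them)
    # across threads/thread blocks; concurrent updates to the same row of M
    # then require atomics or privatized accumulators.
    for (i, j, k), x in zip(coords, vals):
        # Fine level of parallelism: the length-r elementwise update, a
        # natural fit for vector lanes on CPUs or threads within a GPU warp.
        M[i, :] += x * (B[j, :] * C[k, :])
    return M
```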

Cited by 38 publications (31 citation statements)
References 25 publications (49 reference statements)
“…In the literature, there are various CP-ALS implementations adopting different parallelism paradigms [13], [17], [28], [29], [30], [31], [32]. On distributed-memory systems, DMS [17] is the most commonly-used implementation.…”
Section: Related Work
confidence: 99%
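For orientation, the CP-ALS implementations referenced in the statement above share the same outer structure: each factor matrix is updated in turn by a least-squares solve whose dominant cost is an MTTKRP. A hedged Python sketch of a single update follows (generic formulation, not code from any of the cited works; mttkrp_fn is a hypothetical callback):

```python
import numpy as np

def cp_als_update(factors, mttkrp_fn, mode):
    """One ALS update of factors[mode] in a rank-r CP model (illustrative).

    factors:   list of factor matrices A_0, ..., A_{d-1}, each of shape (n_m, r)
    mttkrp_fn: callable mttkrp_fn(mode) returning the (n_mode, r) MTTKRP result
    """
    r = factors[0].shape[1]
    # Gram matrix: Hadamard product of A_m^T A_m over all modes except `mode`.
    V = np.ones((r, r))
    for m, A in enumerate(factors):
        if m != mode:
            V *= A.T @ A
    M = mttkrp_fn(mode)                    # dominant cost: the MTTKRP
    factors[mode] = M @ np.linalg.pinv(V)  # small r-by-r (pseudo)inverse
    return factors
```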
“…Bader and Kolda [3] consider both dense and sparse Y tensors, showing that the cost is O(rn^d) for dense Y and O(r nnz(Y)) for sparse Y. Phan, Tichavsky, and Cichocki [39] propose methods to reuse partial computations when computing the MTTKRP for all d modes in sequence. Much recent work has focused on more efficient representations of sparse tensors and parallel MTTKRP computations [44,24,29,40]. There is also continued work on improving the efficiency of dense MTTKRP calculations [20,5].…”
Section: Tensor Notation
confidence: 99%
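The cost contrast quoted above (O(rn^d) dense versus O(r nnz(Y)) sparse) follows from how the kernel is evaluated: the dense formulation multiplies the mode-n unfolding by an explicitly formed Khatri-Rao product, touching every tensor entry, while the sparse formulation accumulates contributions only from the nonzeros. A small NumPy illustration of the dense mode-0 case for d = 3 (standard textbook formulation, not code from any of the cited works):

```python
import numpy as np

def khatri_rao(B, C):
    """Column-wise Kronecker (Khatri-Rao) product, shape (n1*n2, r)."""
    r = B.shape[1]
    return np.einsum('jr,kr->jkr', B, C).reshape(-1, r)

def mttkrp_mode0_dense(Y, B, C):
    """Dense mode-0 MTTKRP: the mode-0 unfolding times the Khatri-Rao product.

    Y: dense (n0, n1, n2) tensor; B: (n1, r); C: (n2, r).
    Cost is O(r * n0 * n1 * n2), i.e. O(r n^d) for a cubical tensor,
    versus O(r * nnz(Y)) when only the nonzeros are traversed.
    """
    n0 = Y.shape[0]
    Y0 = Y.reshape(n0, -1)            # mode-0 unfolding, shape (n0, n1*n2)
    return Y0 @ khatri_rao(B, C)      # shape (n0, r)
```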
“…In terms of implementations, an interesting consequence of sampling in the context of parallel tensor decomposition [44,24,29,40] is that we can reduce the computation and/or communication by sampling only a subset of the entries. Moreover, we may be able to stratify the samples in such a way that is amenable to more structured communications.…”
Section: Sampling
confidence: 99%
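One way to realize the sampling idea described in the statement above is to accumulate the MTTKRP update over a random subset of entries and rescale. This is a sketch under the assumption of uniform sampling of the nonzeros; the cited works may stratify or weight samples differently:

```python
import numpy as np

def sampled_mttkrp_mode0(coords, vals, B, C, sample_frac, rng=None):
    """Approximate mode-0 MTTKRP from a uniform sample of the nonzeros.

    Touching only a fraction of nnz(Y) reduces computation and, in a
    distributed setting, the number of entries that must be communicated.
    """
    rng = np.random.default_rng() if rng is None else rng
    nnz = vals.shape[0]
    s = max(1, int(sample_frac * nnz))
    idx = rng.choice(nnz, size=s, replace=False)

    n0, r = coords[:, 0].max() + 1, B.shape[1]
    M = np.zeros((n0, r))
    for (i, j, k), x in zip(coords[idx], vals[idx]):
        M[i, :] += x * (B[j, :] * C[k, :])
    return M * (nnz / s)               # rescale so the estimate is unbiased
```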
“…Besides, their performance efficiency is low because of the MATLAB environment. Recently, many other highly performance-efficient libraries have emerged, such as SPLATT [130], Cyclops Tensor Framework (CTF) [132], DFacTo [24], GigaTensor [65], HyperTensor [69], GenTen [110], to name a few. However, these libraries are specific to one or two particular sparse tensor operations, which violates the application diversity requirement.…”
Section: PASTA in Need
confidence: 99%