2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
DOI: 10.1109/ipdps.2016.113
A Medium-Grained Algorithm for Sparse Tensor Factorization

Cited by 49 publications (62 citation statements); references 18 publications.
“…In particular, Liavas et al [6] extend a parallel algorithm designed for sparse tensors [25] to the 3D dense case. They use the "medium-grained" dense tensor distribution and rowwise factor matrix distribution, which is exactly the same as our distribution strategy (see section IV-C2), and they use a Nesterov-based algorithm to enforce the nonnegativity constraints.…”
Section: Related Work
confidence: 99%
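As a rough illustration of the medium-grained idea described above (a 3D processor grid that blocks each mode's index range, with factor matrices split rowwise), here is a minimal sketch; the function name, grid shape, and blocking rule are our own illustrative assumptions, not taken from the cited implementations.

```python
# Hypothetical sketch of a "medium-grained" owner map: a p1 x p2 x p3
# processor grid partitions the tensor's index space by blocking each
# mode's index range. Illustrative only; not from the cited papers.

def medium_grained_owner(i, j, k, dims, grid):
    """Map tensor entry (i, j, k) to its owning process coordinates
    on a p1 x p2 x p3 grid by blocking each mode's index range."""
    (I, J, K), (p1, p2, p3) = dims, grid
    bi = min(i * p1 // I, p1 - 1)   # block along mode 1
    bj = min(j * p2 // J, p2 - 1)   # block along mode 2
    bk = min(k * p3 // K, p3 - 1)   # block along mode 3
    return (bi, bj, bk)

# Example: a 6x6x6 tensor on a 2x3x1 grid.
print(medium_grained_owner(0, 0, 0, (6, 6, 6), (2, 3, 1)))  # (0, 0, 0)
print(medium_grained_owner(5, 4, 5, (6, 6, 6), (2, 3, 1)))  # (1, 2, 0)
```

Each nonzero (for sparse tensors) or block (for dense tensors) is then stored by its owner, and factor-matrix rows are communicated only within grid slices that share an index range.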
“…Prior work has also studied the Tucker decomposition on the MapReduce platform [11]. Other tensor decompositions such as CP factorization have been explored as well (e.g., [13,16,12,25,14]).…”
Section: Procedure
confidence: 99%
“…In this section, we discuss prior schemes proposed in the context of Tucker decomposition [15], as well as the related CP decomposition [25]. The schemes can be categorized into three types.…”
Section: Prior Distribution Schemes
confidence: 99%
“…With regard to tensor factorization, designing high-performance implementations of CP-ALS, as well as measuring their performance, is an active area of research [22]. There have been efforts to perform tensor factorization on both shared- and distributed-memory systems [23], [24], [25], as well as on GPUs [26], [27]. However, to the best of our knowledge, ReFacTo is the only current implementation of CP-ALS that runs on multiple GPUs in a distributed fashion and is able to utilize GPU communication hardware and software.…”
Section: Related Work
confidence: 99%
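The citation statements above all revolve around parallelizing CP-ALS. For context, a minimal serial NumPy sketch of the alternating least-squares updates that these distributed implementations parallelize is shown below; it is a toy dense version under our own naming, not code from any of the cited systems.

```python
# Minimal dense CP-ALS sketch (serial, toy-scale). The distributed
# implementations cited above parallelize exactly these MTTKRP-based
# factor updates across processes or GPUs.
import numpy as np

def khatri_rao(A, B):
    """Column-wise Khatri-Rao product of A (I x R) and B (J x R) -> (I*J x R)."""
    R = A.shape[1]
    return np.einsum('ir,jr->ijr', A, B).reshape(-1, R)

def cp_als(X, rank, iters=30, seed=0):
    """Rank-`rank` CP decomposition of a 3-way array X via ALS."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    X1 = X.reshape(I, J * K)                     # mode-1 unfolding
    X2 = np.moveaxis(X, 1, 0).reshape(J, I * K)  # mode-2 unfolding
    X3 = np.moveaxis(X, 2, 0).reshape(K, I * J)  # mode-3 unfolding
    for _ in range(iters):
        # Each update solves a linear least-squares problem; the Gram
        # matrices (B.T @ B) * (C.T @ C) etc. are small (rank x rank).
        A = X1 @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = X2 @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = X3 @ khatri_rao(A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C
```

In a distributed setting, the tensor unfoldings are partitioned across the process grid and the factor-matrix updates require communicating only the rows shared between neighboring grid slices, which is the communication pattern the medium-grained distribution is designed to minimize.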