2019
DOI: 10.48550/arxiv.1909.01149
Preprint

PLANC: Parallel Low Rank Approximation with Non-negativity Constraints

Srinivas Eswar, Koby Hayashi, Grey Ballard, et al.

Abstract: We consider the problem of low-rank approximation of massive dense non-negative tensor data, for example to discover latent patterns in video and imaging applications. As the size of data sets grows, single workstations are hitting bottlenecks in both computation time and available memory. We propose a distributed-memory parallel computing solution to handle massive data sets, loading the input data across the memories of multiple nodes and performing efficient and scalable parallel algorithms to compute the low-rank approximation.
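As a point of reference, here is a minimal single-node numpy sketch of the computation PLANC parallelizes: a rank-R nonnegative CP approximation of a dense order-3 tensor via alternating least squares. The clip-to-nonnegative update is a crude stand-in for the proper nonnegative least-squares solvers the paper supports (e.g., block principal pivoting or HALS), and all names here are illustrative, not PLANC's API.

```python
# Minimal single-node sketch (illustrative only, not PLANC's API):
# rank-R nonnegative CP approximation of a dense order-3 tensor via
# alternating least squares with a clip-to-nonnegative update.
import numpy as np

def khatri_rao(A, B):
    # Column-wise Khatri-Rao product; rows are indexed by (row of A,
    # row of B), with B's row index varying fastest.
    I, R = A.shape
    J, _ = B.shape
    return (A[:, None, :] * B[None, :, :]).reshape(I * J, R)

def ncp_als(T, R, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    dims = T.shape
    F = [rng.random((d, R)) for d in dims]  # nonnegative initial factors
    for _ in range(iters):
        for n in range(3):
            a, b = [m for m in range(3) if m != n]
            KR = khatri_rao(F[a], F[b])                     # (I_a*I_b) x R
            Tn = np.moveaxis(T, n, 0).reshape(dims[n], -1)  # mode-n unfolding
            M = Tn @ KR                                     # MTTKRP, the key kernel
            G = (F[a].T @ F[a]) * (F[b].T @ F[b])           # Hadamard of Gram matrices
            # Unconstrained LS solve followed by clipping; PLANC instead
            # uses true nonnegative least-squares solvers at this step.
            F[n] = np.clip(np.linalg.solve(G, M.T).T, 1e-12, None)
    return F

# Example: approximate a small random nonnegative tensor at rank 4.
T = np.random.default_rng(1).random((20, 15, 10))
factors = ncp_als(T, R=4)
```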

Cited by 1 publication (6 citation statements) | References 43 publications
“…First, we propose the multi-sweep dimension tree (MSDT) algorithm, which requires the TTM between an order-N input tensor with dimension size s and the first-contracted input matrix once every (N−1)/N sweeps and reduces the leading per-sweep computational cost of a rank-R CP-ALS to 2N/(N−1) · s^N · R. This algorithm produces exactly the same results as the standard dimension tree, i.e., it has no accuracy loss. Leveraging a parallelization strategy similar to previous work [3], [10] that performs the dimension tree calculations locally, our benchmark results show a speed-up of 1.25× compared to the state-of-the-art dimension tree running on 1024 processors.…”
Section: Introduction
confidence: 75%
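To make the quoted cost concrete: assuming the standard count of 2·s^N·R flops for one TTM between the order-N tensor and an s × R factor matrix (an assumption; the statement above gives only the final figure), the per-sweep cost follows directly:

```latex
% One TTM every (N-1)/N sweeps is N/(N-1) TTMs per sweep on average.
% At 2 s^N R flops per TTM, the leading per-sweep cost of rank-R CP-ALS is
\frac{N}{N-1} \cdot 2\, s^N R \;=\; \frac{2N}{N-1}\, s^N R,
% matching the figure quoted above.
```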
“…Our parallel algorithms for CP-ALS on dense tensors are based on Algorithm 3, which is introduced in [3], [10]. The input tensor T of order N is uniformly distributed across an order-N processor grid P, and all the factor matrices are initially distributed such that each processor owns a subset of the rows.…”
Section: E. Parallel CP-ALS
confidence: 99%
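The distribution the statement describes can be sketched schematically (no MPI, invented sizes; a simplified view in which factor-row ownership mirrors the tensor partition along each mode, whereas the actual algorithm spreads each factor's rows over all processors):

```python
# Schematic sketch of the data distribution quoted above: an order-3
# dense tensor partitioned over an order-3 processor grid. Sizes and
# names are invented for illustration.
import numpy as np
from itertools import product

dims = (8, 6, 4)   # global tensor dimensions (illustrative)
grid = (2, 3, 2)   # processor grid, one axis per tensor mode

def local_block(coord):
    # Index ranges of the tensor block owned by the processor at `coord`;
    # each mode is split evenly across the corresponding grid axis.
    return tuple(slice(c * d // g, (c + 1) * d // g)
                 for c, d, g in zip(coord, dims, grid))

T = np.arange(np.prod(dims), dtype=float).reshape(dims)
for coord in product(*map(range, grid)):
    print(coord, T[local_block(coord)].shape)  # every block is 4 x 2 x 2
```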