19th International Conference on High Performance Computing (HiPC 2012)
DOI: 10.1109/hipc.2012.6507483

Sparse matrix-matrix multiplication on modern architectures

Cited by 40 publications (20 citation statements)
References 23 publications

“…Furthermore, MAGMA employs task-based work distribution, in contrast to the symmetric data-parallel approach used in this work. Matam et al. [17] have implemented a hybrid CPU/GPU solver for sparse matrix multiplication. However, they do not scale their solution beyond a single node.…”
Section: A. Related Work
Mentioning confidence: 99%
“…The peak multiplication performance is 16 GFlop/s, and the overall peak performance (multiplication + addition) is 32 GFlop/s. The roof in our setting is 2.3× over our performance, and 9.6× over OuterSPACE's. We are much nearer to the roof than OuterSPACE is.…”
Section: B. Experimental Results
Mentioning confidence: 80%
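
As a sanity check on the arithmetic in this excerpt, the short Python sketch below recomputes the implied throughputs from the stated 32 GFlop/s roof. It assumes the garbled figure in the quote reads as a 2.3× gap; the ~4.2× ratio between the two systems is derived here, not quoted.

```python
# Rough arithmetic check of the roofline claim above, in plain Python.
# Assumptions: the 32 GFlop/s multiply+add peak is the "roof", and the
# garbled "23" in the excerpt is read as a 2.3x gap to that roof.
roof = 32.0               # GFlop/s, multiplication + addition peak
gap_this_work = 2.3       # roof / achieved throughput (this work)
gap_outerspace = 9.6      # roof / achieved throughput (OuterSPACE)

this_work = roof / gap_this_work        # ~13.9 GFlop/s
outerspace = roof / gap_outerspace      # ~3.3 GFlop/s

# Implied speedup of this work over OuterSPACE (a derived value,
# not a number quoted in the excerpt): ~4.2x.
print(f"this work:  {this_work:.1f} GFlop/s")
print(f"OuterSPACE: {outerspace:.1f} GFlop/s")
print(f"implied speedup: {this_work / outerspace:.1f}x")
```
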
“…It achieves good output reuse with pipelined multiply and merge, matrix condensing, and a Huffman tree scheduler, and good input reuse with a row prefetcher. … memory access pattern and poor locality caused by low-density matrices [21], [22], [23]. For instance, the density of Twitter's [24] adjacency matrix is as low as 0.000214%.…”
Section: Introduction
Mentioning confidence: 99%
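
The "multiply and merge" pipeline mentioned in the excerpt follows the outer-product formulation of SpGEMM. The sketch below is a software analogue of that dataflow, not the cited accelerator's hardware: a multiply phase emits partial products for each shared index k, and a merge phase accumulates them. The dict-of-nonzeros representation and function name are illustrative choices.

```python
# Software analogue of outer-product SpGEMM: multiply phase generates
# partial products per shared index k; merge phase accumulates them.
from collections import defaultdict

def outer_product_spgemm(A, B):
    """A, B: dicts mapping (row, col) -> value for the nonzeros."""
    # Index A by column and B by row so each k pairs a column of A
    # with a row of B (one outer product per k).
    a_by_col = defaultdict(list)
    for (i, k), v in A.items():
        a_by_col[k].append((i, v))
    b_by_row = defaultdict(list)
    for (k, j), v in B.items():
        b_by_row[k].append((j, v))

    # Multiply phase: emit partial products a[i,k] * b[k,j].
    # Merge phase: accumulate them into the output (a dict here; the
    # accelerator instead merges sorted partial results on chip).
    C = defaultdict(float)
    for k, a_entries in a_by_col.items():
        for i, av in a_entries:
            for j, bv in b_by_row.get(k, []):
                C[(i, j)] += av * bv
    return dict(C)

# Tiny usage example: a 2x2 sparse product.
A = {(0, 0): 1.0, (1, 1): 2.0}
B = {(0, 1): 3.0, (1, 0): 4.0}
print(outer_product_spgemm(A, B))  # {(0, 1): 3.0, (1, 0): 8.0}
```
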
“…Regular matrices result from problems involving mesh approximations, e.g., from finite element methods, while irregular matrices mostly result from network structures. These matrices were also used for performance tests in [32] and therefore provide a basis for comparison. The matrix mouse280 originates from a finite difference mesh (using a seven-point stencil) that models the diffuse light propagation inside a mouse [21].…”
Section: Performance Measurements
Mentioning confidence: 99%
“…Considering the same example application as in [12,31,32], we measured the time to compute the square of a sparse matrix, C = A·A. To assess the scalability, the performance was measured as a function of the matrix width for three-dimensional Poisson matrices (Figure 5).…”
Section: Matrix Squaring: Performance Comparison
Mentioning confidence: 99%
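
A minimal sketch of this kind of measurement, assuming SciPy stands in for the libraries being compared: it assembles three-dimensional Poisson matrices (the usual seven-point stencil, built here via Kronecker sums) of growing width and times the sparse product C = A·A. The grid sizes and output format are illustrative.

```python
# Time sparse matrix squaring C = A*A for 3D Poisson matrices of
# increasing grid width, in the spirit of the measurement above.
import time
import scipy.sparse as sp

def poisson3d(n):
    """3D Poisson matrix (7-point stencil) on an n x n x n grid."""
    T = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n), format="csr")
    I = sp.identity(n, format="csr")
    return (sp.kron(sp.kron(T, I), I)
            + sp.kron(sp.kron(I, T), I)
            + sp.kron(sp.kron(I, I), T)).tocsr()

for n in (10, 20, 40):                 # illustrative grid widths
    A = poisson3d(n)
    t0 = time.perf_counter()
    C = A @ A                          # sparse matrix squaring
    dt = time.perf_counter() - t0
    print(f"n={n:3d}  dim={A.shape[0]:7d}  nnz(C)={C.nnz:9d}  {dt:.3f}s")
```
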