SC14: International Conference for High Performance Computing, Networking, Storage and Analysis 2014
DOI: 10.1109/sc.2014.82

Efficient Shared-Memory Implementation of High-Performance Conjugate Gradient Benchmark and its Application to Unstructured Matrices

Cited by 44 publications (45 citation statements)
References 24 publications
“…We can also note that on certain iterative approaches, the domain decomposition negatively impacts the convergence [21,22] regardless of whether it is implemented with process or thread parallelism [21]. This is not the case in the matrix assembly part and therefore, the observed improvements are only related to our new parallelization strategy.…”
Section: Finite Element Methods Matrix Assembly (mentioning)
confidence: 86%
“…This leads to a better locality than the original ordering using the Cuthill-McKee approach [5]. We also observe that current coloring strategies [7,9,21] are not efficient on the very small data partition size of the fine grain task-based parallelism. We propose a new coloring heuristic to reveal data-parallelism in small partitions.…”
Section: Introduction (mentioning)
confidence: 85%
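
For readers unfamiliar with how coloring exposes data-parallelism, the following is a minimal sketch of a generic greedy coloring on the adjacency graph of a sparse symmetric matrix. It is not the heuristic proposed in the citing paper, nor the method of the SC14 paper; the function names `greedy_coloring` and `group_by_color` and the 5-vertex chain are assumptions chosen purely for illustration. Vertices that receive the same color share no edge, so the corresponding rows could in principle be updated concurrently.

```python
def greedy_coloring(adjacency):
    """Give each vertex the smallest color not used by an already-colored neighbor.

    `adjacency` maps a vertex to the list of its neighbors (hypothetical input
    format, standing in for the off-diagonal pattern of a symmetric matrix).
    """
    colors = {}
    for v in sorted(adjacency):
        used = {colors[u] for u in adjacency[v] if u in colors}
        c = 0
        while c in used:
            c += 1
        colors[v] = c
    return colors


def group_by_color(colors):
    """Collect vertices into one list per color, ordered by color index."""
    groups = {}
    for v, c in colors.items():
        groups.setdefault(c, []).append(v)
    return [groups[c] for c in sorted(groups)]


# Illustrative graph: a chain 0-1-2-3-4, i.e. the pattern of a tridiagonal matrix.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
colors = greedy_coloring(chain)
groups = group_by_color(colors)
print(groups)  # [[0, 2, 4], [1, 3]]
```

On this chain the sketch yields two independent sets, the even-numbered and the odd-numbered vertices. On very small partitions each color class may hold only a few vertices, which is consistent with the limitation the excerpt above points out for existing coloring strategies.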