2015
DOI: 10.1007/s11075-015-9974-9

A parallel fast boundary element method using cyclic graph decompositions

Abstract: We propose a method of a parallel distribution of densely populated matrices arising in boundary element discretizations of partial differential equations. In our method the underlying boundary element mesh consisting of n elements is decomposed into N submeshes. The related N × N submatrices are assigned to N concurrent processes to be assembled. Additionally we require each process to hold exactly one diagonal submatrix, since its assembling is typically most time consuming when applying fast boundary element…

Cited by 10 publications (5 citation statements)
References: 19 publications
“…Our work on hierarchical PCA was inspired by [4,5], where the authors used the method for clustering of 3D surface meshes into binary trees. The method is invariant with respect to rigid body modes, which may make it superior when compared, for instance, to oct-tree algorithms.…”
Section: Discussion (mentioning)
confidence: 99%
“…We refer to [3], which focuses on machine learning. The PCA also has good application for sparsification of densely populated matrices arising in boundary element methods [4,5].…”
Section: Related Work (mentioning)
confidence: 99%
“…Due to a limited scope of this paper, we refer the reader to [3,16] for more details. The parallelization of the method based on the cyclic graph decomposition was presented in [13] where only certain special numbers of processors were discussed. In [12] we further extended the approach to support general number of processors.…”
Section: Parallel ACA (mentioning)
confidence: 99%
“…Afterwards, the $P^2$ blocks are assembled in parallel via P MPI processes. The workload distribution follows the so-called cyclic graph decomposition introduced in [13] for special values of P and later generalized in [12]. This way, each MPI process requires $O(n/\sqrt{P})$ mesh data for the assembly of the blocks and actively works with $O(n/\sqrt{P})$ degrees of freedom during matrix-vector multiplication, which has a positive effect on the efficiency of the required MPI synchronization phase.…”
Section: Parallel ACA (mentioning)
confidence: 99%
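
The block distribution described in the abstract and in the statement above can be illustrated with a small sketch. It is not the authors' implementation: it assumes the number of processes N admits a perfect difference set (here N = 7 with the set {0, 1, 3}), which is one known way to obtain a cyclic decomposition with about √N submeshes per process. The function name `cyclic_block_assignment` and its parameters are hypothetical.

```python
from itertools import product


def cyclic_block_assignment(n_procs, diff_set):
    """Assign the n_procs x n_procs matrix blocks to n_procs processes.

    Assumption (not taken from the paper): diff_set is a perfect
    difference set modulo n_procs, i.e. every nonzero residue occurs
    exactly once as a difference of two of its elements.
    """
    assignment = {}
    for p in range(n_procs):
        # Submeshes this process needs: the difference set shifted by p.
        submeshes = [(p + d) % n_procs for d in diff_set]
        # All off-diagonal blocks among those submeshes.
        blocks = [(i, j) for i, j in product(submeshes, repeat=2) if i != j]
        # Exactly one diagonal block per process.
        blocks.append((submeshes[0], submeshes[0]))
        assignment[p] = blocks
    return assignment


# Example: 7 processes, perfect difference set {0, 1, 3} modulo 7.
assignment = cyclic_block_assignment(7, (0, 1, 3))
for proc, blocks in assignment.items():
    print(proc, sorted(blocks))
```

In this example each process assembles 7 of the 49 blocks while touching only 3 of the 7 submeshes, which is the kind of reduced mesh-data footprint the quoted statement attributes to the cyclic graph decomposition.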