2006
DOI: 10.1002/nme.1557

Performance comparison of data-reordering algorithms for sparse matrix–vector multiplication in edge-based unstructured grid computations

Abstract: Several performance improvements for finite-element edge-based sparse matrix-vector multiplication algorithms on unstructured grids are presented and tested. Edge data structures for tetrahedral meshes and triangular interface elements are treated, focusing on nodal and edge renumbering strategies for improving processor and memory hierarchy use. Benchmark computations on Intel Itanium 2 and Pentium IV processors are performed. The results show CPU-time performance improvements by factors ranging from 2 to 3.
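
To make the edge-based kernel concrete, the following is a minimal sketch (not the authors' code; the array names and the one-pair-of-coefficients-per-edge layout are assumptions) of an edge-based matrix-vector product for a scalar problem, with the diagonal stored per node and each edge holding its two node indices and two off-diagonal coefficients:

    /* Minimal sketch of an edge-based matrix-vector product y = A*x
     * for a scalar problem on an unstructured grid.
     * Assumed (illustrative) data layout:
     *  - diag[i]        : diagonal coefficient of node i,
     *  - n1[e], n2[e]   : the two nodes connected by edge e,
     *  - a12[e], a21[e] : off-diagonal coefficients (row n1, col n2) and
     *                     (row n2, col n1) associated with edge e.
     */
    void edge_spmv(int nnodes, int nedges,
                   const int *n1, const int *n2,
                   const double *diag, const double *a12, const double *a21,
                   const double *x, double *y)
    {
        /* nodal (diagonal) contribution */
        for (int i = 0; i < nnodes; ++i)
            y[i] = diag[i] * x[i];

        /* edge (off-diagonal) contribution: indirect gather/scatter */
        for (int e = 0; e < nedges; ++e) {
            int i = n1[e], j = n2[e];
            y[i] += a12[e] * x[j];
            y[j] += a21[e] * x[i];
        }
    }

The two indirect updates per edge in the second loop are exactly the memory accesses whose pattern the nodal and edge renumbering strategies aim to make cache friendly.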

Cited by 29 publications (29 citation statements); references 20 publications.

Citation statements, ordered by relevance:
“…Most of the computational effort spent in this solution procedure is due to the matrix-vector products within the GMRES driver for both flow and marker. To improve the computational efficiency with respect to standard element-by-element and sparse matrix-vector storage schemes, we adopt an edge-based data structure in order to minimize indirect memory addressing, diminish floating point operation counts (flops) and memory requirements, as described in Elias et al [20] and Coutinho et al [19] for both the Navier-Stokes equations and the marking function advection. Further computational gains are obtained from data preprocessing performed by the EdgePack library, a package to improve cache reutilization based on reordering and grouping techniques [40].…”
Section: Solution Procedures (mentioning)
confidence: 99%
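
For contrast with the edge-based kernel sketched after the abstract, a standard compressed sparse row (CSR) product stores every nonzero coefficient explicitly and gathers x through a column-index array on each row; an edge-based layout stores one pair of off-diagonal coefficients per edge instead, which is what cuts the indexing overhead and memory traffic the quotation refers to. A minimal, illustrative CSR kernel (not taken from any of the cited codes):

    /* Standard CSR matrix-vector product y = A*x, shown only for contrast
     * with the edge-based kernel above; array names are illustrative. */
    void csr_spmv(int nrows, const int *rowptr, const int *colind,
                  const double *val, const double *x, double *y)
    {
        for (int i = 0; i < nrows; ++i) {
            double sum = 0.0;
            for (int k = rowptr[i]; k < rowptr[i + 1]; ++k)
                sum += val[k] * x[colind[k]];   /* indirect access to x */
            y[i] = sum;
        }
    }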
“…The main characteristics of our incompressible flow solver [19][20][21] are: SUPG and pressure-stabilizing/Petrov-Galerkin (PSPG) [3,22] stabilized finite element formulation; implicit time marching scheme with adaptive time stepping control; advanced inexact Newton solvers; edge-based data structures to save memory and improve performance; support to message passing and shared memory parallel programming models; and large eddy simulation (LES) extensions using a classical Smagorinsky model. We introduce VOF extensions in this flow solver to track the evolving free surface [11,13,14].…”
Section: Introduction (mentioning)
confidence: 99%
“…The computations are performed in parallel using a distributed-memory paradigm through the Message Passing Interface library. The parallel partitions are generated by the Metis library [44], whereas the information regarding the edges of the computational grid is obtained from the EdgePack library as described in [45]. EdgePack also reorders nodes, edges and elements to improve data locality, efficiently exploiting the memory hierarchy of current processors.…”
Section: Solution Procedures (mentioning)
confidence: 99%
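
EdgePack's actual reordering and grouping algorithms are the ones described in [45]; purely as an illustration of the kind of preprocessing involved, one simple locality-oriented step is to sort the edge list by the node indices each edge touches, so that consecutive edges reuse nearby entries of the nodal arrays. The Edge struct below is an assumed layout, not EdgePack's:

    /* Illustrative edge reordering (not EdgePack's actual algorithm):
     * sort the edge list lexicographically by (min node, max node) so that
     * edges touching the same nodes are visited consecutively, which tends
     * to keep the corresponding x[] and y[] entries in cache during the
     * edge loop of the matrix-vector product. */
    #include <stdlib.h>

    typedef struct { int n1, n2; } Edge;

    static int edge_cmp(const void *pa, const void *pb)
    {
        const Edge *a = (const Edge *)pa, *b = (const Edge *)pb;
        int alo = a->n1 < a->n2 ? a->n1 : a->n2;
        int blo = b->n1 < b->n2 ? b->n1 : b->n2;
        if (alo != blo) return alo - blo;
        int ahi = a->n1 < a->n2 ? a->n2 : a->n1;
        int bhi = b->n1 < b->n2 ? b->n2 : b->n1;
        return ahi - bhi;
    }

    void reorder_edges_for_locality(Edge *edges, int nedges)
    {
        qsort(edges, (size_t)nedges, sizeof(Edge), edge_cmp);
    }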
“…Regarding the cache sharing scheme, older Intel Xeon processors, although behaving as quad-core chips, are in fact two dual-core processors put together. Mesh entities are ordered to improve data locality as described in [1].…”
Section: Performance Tests (mentioning)
confidence: 99%
“…It is important to remember that EdgeCFD's main kernels (matrix-vector product, stiffness matrix build-up and assembly of element residuals) rely strongly on indirect memory addressing operations and are thus influenced by how mesh entities are accessed and used during these operations. In EdgeCFD, mesh entities are reordered to make efficient use of cache memory, as explained in detail in [1]. However, due to the complexity of the software's main loops, cache misses are expected even for reordered meshes.…”
Section: Fig. 2 Speedup For Two Xeon Systems Running Up To 8 Intra-N… (mentioning)
confidence: 99%
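
As a rough sketch of what a nodal renumbering for data locality can look like (not the specific strategies compared in [1]), a breadth-first renumbering in the spirit of Cuthill-McKee gives neighbouring nodes nearby new indices, so the nodal arrays touched by one edge or element tend to fall in the same cache lines. The CSR-like adjacency arrays (xadj, adj) are assumed inputs:

    /* Minimal BFS-based nodal renumbering (in the spirit of bandwidth-
     * reducing orderings such as Cuthill-McKee; not necessarily the
     * strategy used in [1]).
     * Inputs : n nodes, adjacency in CSR-like form (xadj, adj).
     * Output : perm[old] = new index. */
    #include <stdlib.h>

    void bfs_renumber(int n, const int *xadj, const int *adj, int *perm)
    {
        int *queue = malloc((size_t)n * sizeof *queue);
        for (int i = 0; i < n; ++i) perm[i] = -1;   /* -1 = not yet numbered */

        int next = 0;
        for (int seed = 0; seed < n; ++seed) {      /* handle disconnected meshes */
            if (perm[seed] != -1) continue;
            int head = 0, tail = 0;
            queue[tail++] = seed;
            perm[seed] = next++;
            while (head < tail) {
                int v = queue[head++];
                for (int k = xadj[v]; k < xadj[v + 1]; ++k) {
                    int w = adj[k];
                    if (perm[w] == -1) {
                        perm[w] = next++;
                        queue[tail++] = w;
                    }
                }
            }
        }
        free(queue);
    }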