Unfitted finite element techniques are valuable tools in applications where the generation of body-fitted meshes is difficult. However, these techniques are prone to severe ill-conditioning problems that obstruct the efficient use of iterative Krylov methods and, in consequence, hinder the practical usage of unfitted methods for realistic large-scale applications. In this work, we present a technique that addresses such conditioning problems by constructing enhanced finite element spaces based on a cell aggregation technique. The presented method, called the aggregated unfitted finite element method, is easy to implement and, in contrast to previous works, can be used in Galerkin approximations of coercive problems with conforming Lagrangian finite element spaces. The mathematical analysis of the new method shows that the condition number of the resulting linear system matrix scales as in standard finite elements for body-fitted meshes, without being affected by small cut cells, and that the method achieves the optimal finite element convergence order. These theoretical results are confirmed with 2D and 3D numerical experiments.
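The scaling claim can be stated compactly. For a second-order coercive problem discretized with mesh size $h$, the standard body-fitted estimate (quoted here for illustration; the abstract asserts the aggregated method recovers it on cut meshes) reads

\[
\kappa(A) \le C\, h^{-2},
\]

with the constant $C$ independent of the position of the cuts, i.e., the bound does not degrade as cut cells become arbitrarily small relative to $h$.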
FEMPAR is an open-source, object-oriented Fortran200X scientific software library for the high-performance, scalable simulation of complex multiphysics problems governed by partial differential equations at large scales, exploiting state-of-the-art supercomputing resources. It is a highly modularized, flexible, and extensible library that provides a set of modules that can be combined to carry out the different steps of the simulation pipeline. FEMPAR includes a rich set of algorithms for the discretization step, namely (arbitrary-order) grad-, div-, and curl-conforming finite element methods, discontinuous Galerkin methods, B-splines, and unfitted finite element techniques on cut cells, combined with h-adaptivity. The linear solver module relies on state-of-the-art bulk-asynchronous implementations of multilevel domain decomposition solvers for the different discretization alternatives and block-preconditioning techniques for multiphysics problems. FEMPAR is a framework that provides users with out-of-the-box, state-of-the-art discretization techniques and highly scalable solvers for the simulation of complex applications, hiding the dramatic complexity of the underlying algorithms. But it is also a highly extensible framework for researchers who want to experiment with new algorithms and solvers. In this work, the first in a series of articles about FEMPAR, we provide a detailed introduction to the software abstractions used in the discretization module and the related geometrical module. We also provide some details about the assembly of linear systems arising from finite element discretizations, but the software design of complex scalable multilevel solvers is postponed to a subsequent work.
In this work we propose a novel parallelization approach for two-level balancing domain decomposition by constraints preconditioning, based on overlapping fine-grid and coarse-grid duties in time. The global set of MPI tasks is split into those with fine-grid duties and those with coarse-grid duties, and the different computations and communications in the algorithm are then rescheduled and mapped in such a way that the maximum degree of overlapping is achieved while preserving the data dependencies among them. In many ranges of interest, the extra cost associated with the coarse-grid problem can be fully masked by fine-grid computations (which are embarrassingly parallel). Apart from discussing code implementation details, the paper also presents a comprehensive set of numerical experiments that includes weak scalability analyses with structured and unstructured meshes for the three-dimensional Poisson and linear elasticity problems on a pair of state-of-the-art multicore-based distributed-memory machines. This experimental study reveals remarkable weak scalability in the solution of problems with thousands of millions of unknowns on several tens of thousands of computational cores.

Introduction. Scientific phenomena governed by partial differential equations (PDEs) can be approximated by finite element (FE) methods, which result in sparse linear systems of equations to be solved via numerical linear algebra. The ever-increasing demand for realism in the simulation of the complex scientific and engineering three-dimensional (3D) problems faced nowadays involves the solution of very large sparse linear systems with several hundreds and even thousands of millions of equations/unknowns. The solution of these systems in a moderate time requires the vast amount of computational resources provided by current multicore-based distributed-memory machines.
It is therefore essential to design parallel algorithms able to efficiently exploit the underlying architecture.

Domain decomposition (DD) preconditioning of Krylov iterative solvers provides a natural framework for the development of fast parallel solvers tailored for distributed-memory machines, as it has by construction the desirable design principle of maximizing local computations while minimizing interprocessor communication. One-level DD preconditioners, such as the Neumann-Neumann preconditioner [29], are highly
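The fine/coarse splitting of MPI tasks described in the abstract above can be sketched in a few lines (a minimal illustration that mimics an MPI communicator split; the function and variable names are hypothetical and not FEMPAR's actual API, and the real rescheduling/overlapping logic is far richer than this rank partition):

```python
def partition_ranks(n_ranks, n_coarse):
    """Split the global set of MPI ranks into disjoint fine-grid and
    coarse-grid duty sets, mimicking an MPI_Comm_split by "color".

    Illustrative only: the fine ranks do the embarrassingly parallel
    local (subdomain) work, while the coarse ranks are dedicated to the
    coarse-grid problem so the two can proceed concurrently.
    """
    if not 0 < n_coarse < n_ranks:
        raise ValueError("need at least one fine and one coarse rank")
    fine = list(range(n_ranks - n_coarse))              # fine-grid duties
    coarse = list(range(n_ranks - n_coarse, n_ranks))   # coarse-grid duties
    return fine, coarse

fine, coarse = partition_ranks(8, 2)
print(fine, coarse)  # [0, 1, 2, 3, 4, 5] [6, 7]
```

Because the two duty sets are disjoint, the coarse-problem cost can be hidden behind the fine-grid computations whenever the coarse ranks finish before the fine ranks need the coarse correction.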
Abstract. In this paper we present a fully-distributed, communicator-aware, recursive, and interlevel-overlapped message-passing implementation of the multilevel balancing domain decomposition by constraints (MLBDDC) preconditioner. The implementation relies heavily on subcommunicators in order to achieve the desired effect of coarse-grain overlapping of computation and communication among the levels in the hierarchy (namely inter-level overlapping). Essentially, the main communicator is split into as many non-overlapping subsets of MPI tasks (i.e., MPI subcommunicators) as levels in the hierarchy. Provided that specialized resources (cores and memory) are devoted to each level, a careful re-scheduling and mapping of all the computations and communications in the algorithm allows a high degree of overlapping to be exploited among levels. All subroutines and associated data structures are expressed recursively, so that MLBDDC preconditioners with an arbitrary number of levels can be built while re-using significant and recurrent parts of the code. This approach leads to excellent weak scalability results as long as level-1 tasks can mask coarser-level duties. We provide a model that indicates how to choose the number of levels and the coarsening ratios between consecutive levels, and that qualitatively determines the scalability limits for a given choice. We have carried out a comprehensive weak scalability analysis of the proposed implementation for the 3D Laplacian and linear elasticity problems. Excellent weak scalability results have been obtained up to 458,752 IBM BG/Q cores and 1.8 million MPI tasks, the first time that exact domain decomposition preconditioners (based only on sparse direct solvers) have reached these scales.

1. Introduction. The simulation of scientific and engineering problems governed by partial differential equations (PDEs) involves the solution of sparse linear systems.
The time spent by an implicit simulation in the linear solver, relative to the overall execution time, grows with the size of the problem and the number of cores [22]. In order to satisfy the ever-increasing demand for realism and complexity in simulations, scientific computing must advance in the development of numerical algorithms and implementations that efficiently exploit the largest amounts of computational resources, and a massively parallel linear solver is a key component in this process.

The growth in computational power now comes from increasing the number of cores in a chip, rather than making cores faster. The next generation of supercomputers, able to reach 1 exaflop/s, is expected to feature billions of cores. Thus, the future of scientific computing will be strongly related to the ability to efficiently exploit these extreme core counts [1]. Only numerical algorithms with all their components scalable will run efficiently on extreme-scale supercomputers. On extreme core counts, it will be a must to reduce communication and synchronization among cores, and overlap communication ...
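The choice of the number of levels and coarsening ratios mentioned in the MLBDDC abstract above can be illustrated with a toy model (the rule and all names below are assumptions for illustration only; the paper derives a more detailed performance model):

```python
import math

def tasks_per_level(n_fine, coarsening_ratios):
    """Toy model of how many MPI tasks serve each level of an MLBDDC
    hierarchy: level l+1 handles ceil(n_l / r_l) tasks, where r_l is
    the coarsening ratio between consecutive levels.

    Hypothetical rule for illustration; it only conveys why moderate
    ratios keep every level's task count (and thus its work) bounded.
    """
    levels = [n_fine]
    for r in coarsening_ratios:
        levels.append(math.ceil(levels[-1] / r))
    return levels

print(tasks_per_level(4096, [64, 64]))  # [4096, 64, 1]
```

Under this model, a three-level hierarchy with coarsening ratio 64 reduces 4096 level-1 tasks to a single level-3 task, and the intermediate level stays small enough for its duties to be masked by level-1 work.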