Nils-Arne Dreier scite author profile

C++ advocates exceptions as the preferred way to handle unexpected behaviour of an implementation in the code. This does not integrate well with the error handling of MPI, which more or less always results in program termination in case of MPI failures. In particular, a local C++ exception can currently lead to a deadlock due to unfinished communication requests on remote hosts. At the same time, future MPI implementations are expected to include an API to continue computations even after a hard fault (node loss), i.e. the worst possible unexpected behaviour. In this paper we present an approach that adds extended exception propagation support to C++ MPI programs. Our technique allows local exceptions to be propagated to remote hosts to avoid deadlocks, and MPI failures on remote hosts to be mapped to local exceptions. Use cases of particular interest are asynchronous 'local failure local recovery' resilience approaches. Our prototype implementation uses MPI-3.0 features only. In addition, we present a dedicated implementation, which integrates seamlessly with MPI-ULFM, i.e. the most prominent proposal for extending MPI towards fault tolerance. Our implementation is available at https://gitlab.dune-project.org/christi/test-mpi-exceptions.

Towards Local-Failure Local-Recovery in PDE Frameworks: The Case of Linear Solvers

Altenbernd, Dreier, Engwer et al. 2021

Strategies for the Vectorized Block Conjugate Gradients Method

Dreier, Engwer 2020

A high-level C++ approach to manage local errors, asynchrony and faults in an MPI application

Engwer, Altenbernd, Dreier et al. 2018

Preprint

The DUNE Framework: Basic Concepts and Recent Developments

Bastian, Blatt, Dedner et al. 2019

Preprint

This paper presents the basic concepts and the module structure of the Distributed and Unified Numerics Environment and reflects on recent developments and general changes that have happened since the release of the first Dune version in 2007 and the main papers describing that state [1,2]. This discussion is accompanied by a description of various advanced features, such as coupling of domains and cut cells, grid modifications such as adaptation and moving domains, high order discretizations and node level performance, non-smooth multigrid methods, and multiscale methods. A brief discussion of current and future development directions of the framework concludes the paper.

Strategies for the vectorized Block Conjugate Gradients method

Dreier, Engwer 2019

Preprint

Block Krylov methods have recently attracted a lot of attention. Due to their increased arithmetic intensity, they offer a promising way to improve performance on modern hardware. Recently, Frommer et al. presented a block Krylov framework that combines the advantages of block Krylov methods and data parallel methods. We review this framework and apply it to the Block Conjugate Gradients method to solve linear systems with multiple right hand sides. In doing so, we consider challenges that occur on modern hardware, such as limited memory bandwidth, the use of SIMD instructions and the communication overhead. We present a performance model to predict the efficiency of different Block CG variants and compare these predictions with experimental numerical results.

Exa-Dune -- Flexible PDE Solvers, Numerical Methods and Applications

Bastian, Altenbernd, Dreier et al. 2019

Preprint
