Some Thermodynamic Relationships for Soils at or Below the Freezing Point: 2. Effects of Temperature and Pressure on Unfrozen Soil Water

We present converged, completely ab initio calculations of the triple differential cross sections for double photoionization of aligned H2 molecules for a photon energy of 75.0 eV. The method of exterior complex scaling, implemented with both the discrete variable representation and B-splines, is used to solve the Schrödinger equation for a correlated continuum wave function corresponding to a single photon having been absorbed by a correlated initial state. Results for a fixed internuclear distance are compared with recent experiments and show that integration over experimental angular and energy resolutions is necessary to produce good qualitative agreement, but does not eliminate some discrepancies. Limitations of current experimental resolution are shown to sometimes obscure interesting details of the cross section.

Hiding Global Communication Latency in the GMRES Algorithm on Massively Parallel Machines

Ghysels¹,

Ashby²,

Meerbergen³

et al. 2013

SIAM J. Sci. Comput.

102

134

In the generalized minimal residual method (GMRES), the global all-to-all communication required in each iteration for orthogonalization and normalization of the Krylov base vectors is becoming a performance bottleneck on massively parallel machines. Long latencies, system noise, and load imbalance cause these global reductions to become very costly global synchronizations. In this work, we propose the use of nonblocking or asynchronous global reductions to hide these global communication latencies by overlapping them with other communications and calculations. A pipelined variation of GMRES is presented in which the result of a global reduction is used only one or more iterations after the communication phase has started. This way, global synchronization is relaxed and scalability is much improved at the expense of some extra computations. The numerical instabilities that inevitably arise due to the typical monomial basis by powering the matrix are reduced and often annihilated by using Newton or Chebyshev bases instead. Our parallel experiments on a medium-sized cluster show significant speedups of the pipelined solvers compared to standard GMRES. An analytical model is used to extrapolate the performance to future exascale systems.
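
The remark about the monomial basis can be illustrated with a small sketch. It is not the authors' implementation: the diagonal test matrix, the use of exact eigenvalues as Newton shifts, and the helper names `leja_order` and `krylov_basis` are all assumptions made for illustration. It compares the angle between the last two basis vectors when the basis is built by plain powering and when shifts in Leja ordering are applied.

```python
import math

def normalize(v):
    """Scale v to unit Euclidean norm."""
    nrm = math.sqrt(sum(x * x for x in v))
    return [x / nrm for x in v]

def cosine(u, v):
    """Cosine of the angle between two unit-norm vectors."""
    return sum(a * b for a, b in zip(u, v))

def leja_order(points):
    """Greedy Leja ordering: each new point maximizes the product of
    distances to the points already chosen (hypothetical helper)."""
    pts = list(points)
    first = max(pts, key=abs)
    ordered = [first]
    pts.remove(first)
    while pts:
        nxt = max(pts, key=lambda t: math.prod(abs(t - s) for s in ordered))
        ordered.append(nxt)
        pts.remove(nxt)
    return ordered

def krylov_basis(eigs, shifts, steps):
    """Build v_{k+1} = (A - shift_k I) v_k for a diagonal A = diag(eigs).
    All-zero shifts give the monomial basis v, Av, A^2 v, ..."""
    v = normalize([1.0] * len(eigs))
    basis = [v]
    for k in range(steps):
        v = normalize([(lam - shifts[k]) * vi for lam, vi in zip(eigs, v)])
        basis.append(v)
    return basis

# Assumed toy spectrum; a real solver would use Ritz value estimates instead.
eigs = [float(j) for j in range(1, 9)]
steps = 3

monomial = krylov_basis(eigs, [0.0] * steps, steps)
newton = krylov_basis(eigs, leja_order(eigs), steps)

cos_monomial = cosine(monomial[-2], monomial[-1])
cos_newton = cosine(newton[-2], newton[-1])
print(cos_monomial, cos_newton)
```

In this toy case the last two monomial vectors are almost parallel (cosine close to 1), while the Newton vectors stay well separated, which is the qualitative effect the abstract attributes to Newton or Chebyshev bases.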

Complete Photo-Induced Breakup of the H₂ Molecule as a Probe of Molecular Electron Correlation

Vanroose

Martín

Rescigno

et al. 2005

Science

120

105

Despite decades of progress in quantum mechanics, electron correlation effects are still only partially understood. Experiments in which both electrons are ejected from an oriented hydrogen molecule by absorption of a single photon have recently demonstrated a puzzling phenomenon: The ejection pattern of the electrons depends sensitively on the bond distance between the two nuclei as they vibrate in their ground state. Here, we report a complete numerical solution of the Schrödinger equation for the double photoionization of H₂. The results suggest that the distribution of photoelectrons emitted from aligned molecules reflects electron correlation effects that are purely molecular in origin.

Analyzing the Effect of Local Rounding Error Propagation on the Maximal Attainable Accuracy of the Pipelined Conjugate Gradient Method

Cools¹,

Yetkin²,

Agullo³

et al. 2018

SIAM J. Matrix Anal. & Appl.

Pipelined Krylov subspace methods typically offer improved strong scaling on parallel HPC hardware compared to standard Krylov subspace methods for large and sparse linear systems. In pipelined methods the traditional synchronization bottleneck is mitigated by overlapping time-consuming global communications with useful computations. However, to achieve this communication hiding strategy, pipelined methods introduce additional recurrence relations for a number of auxiliary variables that are required to update the approximate solution. This paper aims at studying the influence of local rounding errors that are introduced by the additional recurrences in the pipelined Conjugate Gradient method. Specifically, we analyze the impact of local round-off effects on the attainable accuracy of the pipelined CG algorithm and compare to the traditional CG method. Furthermore, we estimate the gap between the true residual and the recursively computed residual used in the algorithm. Based on this estimate we suggest an automated residual replacement strategy to reduce the loss of attainable accuracy on the final iterative solution. The resulting pipelined CG method with residual replacement improves the maximal attainable accuracy of pipelined CG, while maintaining the efficient parallel performance of the pipelined method. This conclusion is substantiated by numerical results for a variety of benchmark problems.

[…] date back to the late 1980s [44] and early 1990s [2,9,10,13,19]. The idea of reducing the number of global communication points in Krylov subspace methods on parallel computer architectures was also used in the s-step methods by Chronopoulos et al. [6,7,8] and more recently by Carson et al. in [3,4]. In addition to communication-avoiding methods, research on hiding global communication by overlapping communication with computations was performed by various authors over the last decades, see Demmel et al. [13], De Sturler et al. [11], and Ghysels et al. [21,22]. We refer the reader to the recent work [5], Section 2 and the references therein for more background and a wider historical perspective on the development of early variants of the CG algorithm that contributed to the current algorithmic drive towards parallel efficiency.

The pipelined CG (p-CG) method proposed in [22] aims at hiding the global synchronization latency of standard preconditioned CG by removing some of the global synchronization points. Pipelined CG performs only one global reduction per iteration. Furthermore, this global communication phase is overlapped by the sparse matrix-vector product (spmv), which requires only local communication. In this way, idle core time is minimized by performing useful computations simultaneously with the time-consuming global communication phase, cf. [18].

The reorganization of the CG algorithm that is performed to achieve the overlap of communication with computations introduces several additional axpy (y ← αx+y) operations to recursively compute auxiliary variables. Vector operations such as an axpy are typi...
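
The recurrences described above can be sketched in serial code. This is a minimal, unpreconditioned illustration under stated assumptions, not the implementation from the paper: the 1D Laplacian test matrix, the zero initial guess, the tolerance, and the helper names (`matvec`, `axpy`, `pipelined_cg`) are chosen only for this example, and no actual communication is overlapped. It also reports the gap between the true residual b - Ax and the recursively updated residual, the quantity the abstract refers to.

```python
import math

def matvec(x):
    """Apply the 1D Laplacian tridiag(-1, 2, -1); an assumed test matrix."""
    n = len(x)
    y = [2.0 * xi for xi in x]
    for i in range(n):
        if i > 0:
            y[i] -= x[i - 1]
        if i < n - 1:
            y[i] -= x[i + 1]
    return y

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def axpy(alpha, x, y):
    """Return alpha * x + y."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]

def pipelined_cg(b, tol=1e-10, max_iter=200):
    n = len(b)
    x = [0.0] * n
    r = list(b)                 # r = b - A x with x = 0
    w = matvec(r)               # auxiliary variable w = A r
    p = [0.0] * n
    s = [0.0] * n               # s tracks A p
    z = [0.0] * n               # z tracks A s
    gamma_old = alpha_old = None
    for it in range(max_iter):
        # Both dot products would form the single global reduction ...
        gamma = dot(r, r)
        delta = dot(w, r)
        # ... which a parallel code would overlap with this matrix-vector product.
        q = matvec(w)
        if math.sqrt(gamma) < tol:
            break
        if it == 0:
            beta = 0.0
            alpha = gamma / delta
        else:
            beta = gamma / gamma_old
            alpha = gamma / (delta - beta * gamma / alpha_old)
        # Additional axpy recurrences for the auxiliary variables.
        z = axpy(beta, z, q)
        s = axpy(beta, s, w)
        p = axpy(beta, p, r)
        x = axpy(alpha, p, x)
        r = axpy(-alpha, s, r)
        w = axpy(-alpha, z, w)
        gamma_old, alpha_old = gamma, alpha
    true_r = [bi - yi for bi, yi in zip(b, matvec(x))]
    gap = math.sqrt(sum((t - ri) ** 2 for t, ri in zip(true_r, r)))
    return x, gap

n = 20
x, gap = pipelined_cg([1.0] * n)
print("residual gap:", gap)
```

For this small, well-conditioned system the gap stays near machine precision; the analysis summarized in the abstract concerns larger problems where the extra recurrences amplify local rounding errors.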

On the indefinite Helmholtz equation: Complex stretched absorbing boundary layers, iterative analysis, and preconditioning

Reps

Vanroose

Zubair

2010

Journal of Computational Physics

This paper studies and analyzes a preconditioned Krylov solver for Helmholtz problems that are formulated with absorbing boundary layers based on complex coordinate stretching. The preconditioner problem is a Helmholtz problem where not only the coordinates in the absorbing layer have an imaginary part, but also the coordinates in the interior region. This results in a preconditioner problem that is invertible with a multigrid cycle. We give a numerical analysis based on the eigenvalues and evaluate the performance with several numerical experiments. The method is an alternative to the complex shifted Laplacian and it gives a comparable performance for the studied model problems.
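
As a rough illustration of why complex coordinate stretching absorbs outgoing waves, the sketch below evaluates exp(ikz) along a contour that is real in the interior and rotated into the complex plane inside the layer. The linear rotation profile, the angle `theta`, and the function names are assumptions for this example; the paper may use a different stretching function.

```python
import cmath
import math

def stretched_coordinate(x, x_layer, theta):
    """Real coordinate for x <= x_layer; beyond it the coordinate is rotated
    by the angle theta into the complex plane (assumed linear profile)."""
    if x <= x_layer:
        return complex(x, 0.0)
    return x_layer + (x - x_layer) * cmath.exp(1j * theta)

def outgoing_wave(x, k, x_layer, theta):
    """Outgoing plane wave exp(i k z) evaluated on the stretched coordinate."""
    return cmath.exp(1j * k * stretched_coordinate(x, x_layer, theta))

k = 20.0                 # wavenumber (illustrative value)
x_layer = 1.0            # start of the absorbing layer
theta = math.pi / 6      # rotation angle

amp_interior = abs(outgoing_wave(0.5, k, x_layer, theta))
amp_layer = abs(outgoing_wave(1.5, k, x_layer, theta))
print(amp_interior, amp_layer)
```

Inside the layer the amplitude decays like exp(-k (x - x_layer) sin(theta)), so the wave is damped before it reaches the outer boundary, while the interior solution is left unchanged.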

The communication-hiding pipelined BiCGstab method for the parallel solution of large unsymmetric linear systems

Cools

Vanroose

2017

Parallel Computing

By reducing the number of global synchronization bottlenecks per iteration and hiding communication behind useful computational work, pipelined Krylov subspace methods achieve significantly improved parallel scalability on present-day HPC hardware. However, this typically comes at the cost of a reduced maximal attainable accuracy. This paper presents and compares several stabilized versions of the communication-hiding pipelined Conjugate Gradients method. The main novel contribution of this work is the reformulation of the multi-term recurrence pipelined CG algorithm by introducing shifts in the recursions for specific auxiliary variables. These shifts reduce the amplification of local rounding errors on the residual. The stability analysis presented in this work provides a rigorous method for selection of the optimal shift value in practice. It is shown that, given a proper choice for the shift parameter, the resulting shifted pipelined CG algorithm restores the attainable accuracy and displays nearly identical robustness to local rounding error propagation compared to classical CG. Numerical results on a variety of SPD benchmark problems compare different stabilization techniques for the pipelined CG algorithm, showing that the shifted pipelined CG algorithm is able to attain a high accuracy while displaying excellent parallel performance.

Local Fourier analysis of the complex shifted Laplacian preconditioner for Helmholtz problems

Cools

Vanroose

2013

Numerical Linear Algebra App

SUMMARY: In this paper, we solve the Helmholtz equation with multigrid preconditioned Krylov subspace methods. The class of shifted Laplacian preconditioners is known to significantly speed up Krylov convergence. However, these preconditioners have a parameter β ∈ ℝ, a measure of the complex shift. Because of contradictory requirements for the multigrid and Krylov convergence, the choice of this shift parameter can be a bottleneck in applying the method. In this paper, we propose a wavenumber‐dependent minimal complex shift parameter, which is predicted by a rigorous k‐grid local Fourier analysis (LFA) of the multigrid scheme. We claim that, given any (regionally constant) wavenumber, this minimal complex shift parameter provides the reader with a parameter choice that leads to efficient Krylov convergence. Numerical experiments in one and two spatial dimensions validate the theoretical results. It appears that the proposed complex shift is both the minimal requirement for a multigrid V‐cycle to converge and near optimal in terms of Krylov iteration count. Copyright © 2013 John Wiley & Sons, Ltd.
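
The role of the shift parameter β in Krylov convergence can be illustrated with a short sketch. It assumes a 1D Dirichlet model problem, a second-order finite difference Laplacian, an exactly inverted preconditioner, and the sign convention -Δ - (1 + iβ)k² for the shifted operator; none of these choices is taken from the paper, and the function name is hypothetical. Under these assumptions every eigenvalue of the preconditioned operator has the form a / (a - iβk²) with a real, and therefore lies on the circle with center 1/2 and radius 1/2 in the complex plane.

```python
import math

def preconditioned_spectrum(n, k, beta):
    """Eigenvalues of M^{-1} A for A = -Laplace - k^2 and
    M = -Laplace - (1 + i*beta) k^2 on a 1D grid with n interior points
    (assumed model problem, preconditioner inverted exactly)."""
    h = 1.0 / (n + 1)
    eigenvalues = []
    for j in range(1, n + 1):
        # Eigenvalue of the discrete 1D Laplacian with Dirichlet boundaries.
        mu = (4.0 / h**2) * math.sin(j * math.pi * h / 2.0) ** 2
        helmholtz = mu - k**2
        shifted = mu - (1.0 + 1j * beta) * k**2
        eigenvalues.append(helmholtz / shifted)
    return eigenvalues

spectrum = preconditioned_spectrum(n=63, k=20.0, beta=0.5)
max_deviation = max(abs(abs(z - 0.5) - 0.5) for z in spectrum)
print(max_deviation)
```

A smaller β pushes the eigenvalues along this circle towards 1, which favors the Krylov method, whereas the abstract notes that the multigrid inversion of the preconditioner needs a sufficiently large shift; this is the contradictory requirement the minimal shift is meant to resolve.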

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

334 Leonard St

Brooklyn, NY 11211


Wim Vanroose

Hiding global synchronization latency in the preconditioned Conjugate Gradient algorithm

Double photoionization of aligned molecular hydrogen

Hiding Global Communication Latency in the GMRES Algorithm on Massively Parallel Machines

Complete Photo-Induced Breakup of the H₂ Molecule as a Probe of Molecular Electron Correlation

Analyzing the Effect of Local Rounding Error Propagation on the Maximal Attainable Accuracy of the Pipelined Conjugate Gradient Method

On the indefinite Helmholtz equation: Complex stretched absorbing boundary layers, iterative analysis, and preconditioning

The communication-hiding pipelined BiCGstab method for the parallel solution of large unsymmetric linear systems

Local Fourier analysis of the complex shifted Laplacian preconditioner for Helmholtz problems
