As computer systems dedicated to scientific calculations become
massively parallel, the poor parallel performance of Fock matrix
diagonalization becomes a major impediment to treating larger molecules
in self-consistent field (SCF) calculations. In this Article,
a novel, highly parallel, and diagonalization-free algorithm for the
accelerated convergence of the SCF procedure is presented. The algorithm,
called Q-Next, draws on the second-order SCF, quadratically convergent
SCF, and direct inversion of the iterative subspace (DIIS) approaches
to enable fast convergence while replacing the Fock matrix diagonalization,
the SCF bottleneck, with matrix multiplications of higher parallel efficiency.
Performance results on both parallel multicore CPU and GPU hardware
for a variety of test molecules and basis sets are presented, showing
that Q-Next achieves a convergence rate comparable to the DIIS method
while being, on average, one order of magnitude faster.
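
To illustrate the general idea of trading diagonalization for matrix multiplications, the following minimal sketch applies McWeeny purification, a standard textbook technique, to a trial density matrix in an orthonormal basis. It is not the Q-Next algorithm, whose details are not given here; the function names `matmul` and `mcweeny_purify`, the tolerance, and the 2x2 example matrix are assumptions chosen for illustration only.

```python
def matmul(a, b):
    """Plain dense matrix product for lists of lists."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def mcweeny_purify(p, tol=1e-12, max_iter=100):
    """Iterate P <- 3 P^2 - 2 P^3 until P stops changing.

    Assumes P is symmetric, expressed in an orthonormal basis, and has
    eigenvalues close enough to 0 or 1 for the iteration to converge.
    Only matrix multiplications are used; no eigensolver is called.
    """
    n = len(p)
    for _ in range(max_iter):
        p2 = matmul(p, p)
        p3 = matmul(p2, p)
        new_p = [[3.0 * p2[i][j] - 2.0 * p3[i][j] for j in range(n)]
                 for i in range(n)]
        change = max(abs(new_p[i][j] - p[i][j])
                     for i in range(n) for j in range(n))
        p = new_p
        if change < tol:
            break
    return p


# Hypothetical trial density matrix: nearly, but not exactly, idempotent.
trial = [[0.9, 0.05],
         [0.05, 0.1]]
purified = mcweeny_purify(trial)
```

After the iteration, `purified` should satisfy P·P ≈ P, which is the idempotency condition that a diagonalization-based SCF step would otherwise enforce by occupying eigenvectors. Each step costs two matrix multiplications, an operation that typically parallelizes well on multicore CPUs and GPUs.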