An optimized parallel algorithm is proposed to address the complicated backward substitution of cyclic reduction when solving tridiagonal linear systems. Adopting a hybrid parallel model, the algorithm combines the cyclic reduction method with the partition method. Compared with the cyclic reduction method, the hybrid algorithm has a simpler backward substitution on parallel computers. In this paper, the operation count and execution time are measured to evaluate and compare these methods. On the basis of these measurements, the hybrid algorithm with a multi-threaded implementation achieves better efficiency than the other parallel methods, namely the cyclic reduction and partition methods. In particular, the proposed approach has the lowest scalar operation count and the shortest execution time on a multi-core computer once the size of the system reaches a certain dimension threshold. The hybrid parallel algorithm improves on the performance of the cyclic reduction and partition methods by 19.2% and 13.2%, respectively. In addition, a comparison of the single-iteration and multi-iteration hybrid parallel algorithms shows that increasing the number of iteration steps of the cyclic reduction method has little effect on the performance of the hybrid parallel algorithm.
ITERATION-BASED HYBRID PARALLEL ALGORITHM
In the past, several direct and iterative methods have been proposed for this problem. The iterative methods [3,4] are mainly Jacobi's method, the Gauss-Seidel method, and successive overrelaxation. The direct methods include the Thomas algorithm [5] and several parallel methods. Although the Thomas algorithm is the fastest algorithm on a serial computer, it cannot be parallelized directly or completely because each step depends on the preceding one [6]. This paper describes several parallel tridiagonal solvers and presents a new hybrid parallel algorithm for tridiagonal systems.
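The step-to-step dependence that limits parallelization of the Thomas algorithm can be seen in a minimal serial sketch. The function name `thomas_solve` and the array layout (`a` for the sub-diagonal, `b` for the diagonal, `c` for the super-diagonal, `d` for the right-hand side) are assumptions made for illustration and are not taken from the paper. The sketch also assumes that no pivoting is needed, for example because the matrix is diagonally dominant.

```python
def thomas_solve(a, b, c, d):
    """Solve a tridiagonal system with the Thomas algorithm (serial sketch).

    a: sub-diagonal   (length n, a[0] is unused)
    b: main diagonal  (length n)
    c: super-diagonal (length n, c[n-1] is unused)
    d: right-hand side (length n)
    Assumes no pivoting is required (e.g. a diagonally dominant matrix).
    """
    n = len(b)
    cp = [0.0] * n  # modified super-diagonal coefficients
    dp = [0.0] * n  # modified right-hand side

    # Forward elimination: iteration i reads cp[i-1] and dp[i-1],
    # so the iterations cannot run independently of one another.
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom

    # Backward substitution: x[i] depends on x[i+1], again a serial chain.
    x = [0.0] * n
    x[n - 1] = dp[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


# Usage: a 4x4 system with diagonal 2 and off-diagonals -1.
solution = thomas_solve(
    [0.0, -1.0, -1.0, -1.0],
    [2.0, 2.0, 2.0, 2.0],
    [-1.0, -1.0, -1.0, 0.0],
    [0.0, 0.0, 0.0, 5.0],
)
```

Both loops carry a dependence from one iteration to the next, which is why parallel solvers restructure the elimination rather than parallelizing these loops directly.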
Related research
The KLU (Clark Kent LU) algorithm, one method available in existing numerical libraries, was proposed by Ekanathan Palamadai [7]. KLU is a high-performance sparse linear solver that employs hybrid ordering mechanisms together with elegant factorization and solve algorithms. It achieves a high-quality fill-in rate and beats many existing solvers in run time when used for matrices arising in circuit simulation. KLU, however, is a fast serial algorithm.
Research into parallelization strategies for tridiagonal solvers continues to be an active area of exploration. The Thomas algorithm, a specialized form of Gaussian elimination, is one of the first algorithms considered for tridiagonal linear systems. It is completed in two distinct steps, forward elimination and backward substitution. The first step removes the coefficients under the diagonal, leaving a new system whose coefficient matrix has two diagonals. The second step applies a backward sweep to the new system in order to ca...