“…Previously, FPGAs were used to demonstrate highly parallel implementations of EVD and SVD based on two-sided Jacobi rotations, accelerating the independent 2 × 2 rotations with a parallel architecture built around a two-dimensional systolic array. In this earlier work, the size of the matrices that could be handled was severely restricted by the limited resources available on FPGAs [Brent and Luk (1982); Brent et al. (1985); Ahmedsaid et al. (2003); Ma et al. (2006)]. In [Brent and Luk (1982); Brent et al. (1985)], the authors demonstrated the efficiency of 2D systolic array designs for EVD and SVD, with a time complexity of O(n log n) for an n-by-n square matrix, where log n was shown to be the number of iterations needed for reasonable convergence to a given threshold when applying the parallel or cyclic Jacobi rotation methods; however, n² processing elements (PEs) are required.…”
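
To make the two-sided Jacobi rotation concrete, the following is a minimal software sketch of the cyclic Jacobi method for the EVD of a real symmetric matrix. It is not the systolic-array design of the cited works: the function name `jacobi_eigenvalues`, the parameters `tol` and `max_sweeps`, and the off-diagonal-norm stopping rule are all assumptions made for illustration. The sketch runs sequentially, whereas a parallel ordering would apply the rotations for disjoint index pairs (p, q) at the same time.

```python
import math


def jacobi_eigenvalues(a, tol=1e-12, max_sweeps=50):
    """Cyclic two-sided Jacobi for a real symmetric matrix (list of lists).

    Illustrative sketch only; returns (sorted eigenvalues, sweeps used).
    """
    n = len(a)
    a = [row[:] for row in a]  # work on a copy of the input
    sweeps = 0
    while sweeps < max_sweeps:
        # Frobenius norm of the off-diagonal part (assumed stopping rule).
        off = math.sqrt(sum(a[i][j] ** 2
                            for i in range(n) for j in range(n) if i != j))
        if off < tol:
            break
        sweeps += 1
        # One sweep visits every pair (p, q) with p < q in row-cyclic order.
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue  # nothing to annihilate for this pair
                # Angle of the 2 x 2 rotation that zeroes a[p][q].
                tau = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, tau) / (abs(tau) + math.sqrt(tau * tau + 1.0))
                c = 1.0 / math.sqrt(t * t + 1.0)
                s = t * c
                # Two-sided update A <- J^T A J, J = [[c, s], [-s, c]] in plane (p, q).
                for k in range(n):  # right multiplication: columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # left multiplication: rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n)), sweeps


if __name__ == "__main__":
    values, used = jacobi_eigenvalues([[4.0, 1.0, 0.0],
                                       [1.0, 3.0, 1.0],
                                       [0.0, 1.0, 2.0]])
    print(values, used)
```

Each rotation touches only rows and columns p and q, which is why rotations on disjoint pairs are independent and can be mapped onto separate processing elements; the count of sweeps returned here corresponds to the iteration count discussed in the passage.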