Abstract. Classical solvers for the dense symmetric eigenvalue problem are limited by their first step, a reduction to tridiagonal form whose cost is dominated by memory accesses during the panel factorization. The solution is to reduce the matrix to banded form instead, which then requires the eigenvalues of the banded matrix to be computed. The standard divide and conquer algorithm can be modified for this purpose. The paper combines this insight with tile algorithms that can be scheduled via a dynamic runtime system on multicore architectures. A detailed analysis of performance and accuracy is included. Speedups of 14-fold and 4-fold are reported relative to LAPACK and Intel's Math Kernel Library.

Key words. divide and conquer, symmetric eigenvalue solver, tile algorithms, dynamic scheduling

AMS subject classifications. 15A18, 65F15, 65F18, 65Y05, 65Y20, 68W10

DOI. 10.1137/110823699

1. Introduction. The objective of this paper is to introduce a new high performance tile divide and conquer (TD&C) eigenvalue solver for dense symmetric matrices on homogeneous multicore architectures. The need to compute eigenvalues arises in various computational science disciplines, e.g., in quantum physics [33], chemistry [37], and mechanics [25], as well as in statistics when performing principal component analysis on the symmetric covariance matrix. As multicore systems continue to gain ground in the high performance computing community, linear algebra algorithms have to be redesigned, or new algorithms developed, in order to take advantage of the architectural features brought by these processing units.

In particular, tile algorithms have recently shown very promising performance results for solving linear systems of equations on multicore architectures using the Cholesky, QR/LQ, and LU factorizations available in the PLASMA [34] library and in similar projects such as FLAME [44].
The PLASMA approach splits the input matrix into square tiles and reorganizes the data within each tile to be contiguous in memory (block data layout) for efficient cache reuse. The whole dataflow execution can then be represented as a directed acyclic graph (DAG) in which nodes are tasks operating on tiles and edges represent the dependencies between them. An efficient and lightweight runtime system environment named QUARK [30] (internally