Consider an instance of Euclidean k-means or k-medians clustering. We show that the cost of the optimal solution is preserved up to a factor of $(1 + \varepsilon)$ under a projection onto a random $O(\log(k/\varepsilon)/\varepsilon^2)$-dimensional subspace. Further, the cost of every clustering is preserved within $(1 + \varepsilon)$. More generally, our result applies to any dimension reduction map satisfying a mild sub-Gaussian-tail condition. Our bound on the dimension is nearly optimal. Additionally, our result applies to Euclidean k-clustering with the distances raised to the p-th power for any constant p.

For k-means, our result resolves an open problem posed by Cohen, Elder, Musco, Musco, and Persu (STOC 2015); for k-medians, it answers a question raised by Kannan.

* Supported by NSF CCF-1718820 and NSF Career CCF-1150062.

Dimension reduction. The cornerstone dimension reduction statement for the Euclidean distance is the Johnson-Lindenstrauss Lemma [JL84]. For positive reals $p, q, \varepsilon$, we write $p \approx_{1+\varepsilon} q$ if $\frac{1}{1+\varepsilon} \cdot p \le q \le (1 + \varepsilon) \cdot p$.

Theorem 1.1 ([JL84]). There exists a family of random linear maps $\pi_{m,d} : \mathbb{R}^m \to \mathbb{R}^d$ with the following properties. For every $m \ge 1$, $\varepsilon, \delta \in (0, 1/2)$ and all $x \in \mathbb{R}^m$, we have
$$\Pr_{\pi \sim \pi_{m,d}} \left( \|\pi x\| \approx_{1+\varepsilon} \|x\| \right) \ge 1 - \delta, \quad \text{where } d = O\!\left(\frac{\log(1/\delta)}{\varepsilon^2}\right).$$

A straightforward corollary is that one can embed any n-point subset of a Euclidean space into an $O\!\left(\frac{\log n}{\varepsilon^2}\right)$-dimensional space while preserving all pairwise distances up to a factor of $(1 + \varepsilon)$. This bound is known to be tight [Alo03, LN17]. The attractive feature of the dimension reduction procedure given by Theorem 1.1 is that it is data-oblivious, i.e., the distribution over linear maps is independent of the set of points we apply it to.

There are several constructions of families of random maps $\pi_{m,d}$ that satisfy Theorem 1.1: projections onto a random subspace [JL84, DG03] and maps given by matrices with i.i.d. Gaussian and sub-Gaussian entries [IM98, Ach03, KM+05]. All of these constructions satisfy a certain additional condition, which we will need later.

Definition 1.2.
A family of random linear maps $\pi_{m,d} : \mathbb{R}^m \to \mathbb{R}^d$ is called sub-Gaussian-tailed if for every unit vector $x \in \mathbb{R}^m$ and every $t > 0$, one has: $\Pr_{\pi \sim \pi_{m,d}}$