We revisit the question of characterizing the convergence rate of plug-in estimators of optimal transport costs. It is well known that an empirical measure comprising independent samples from an absolutely continuous distribution on $\mathbb{R}^d$ converges to that distribution at the rate $n^{-1/d}$ in Wasserstein distance, which can be used to prove that plug-in estimators of many optimal transport costs converge at this same rate. However, we show that when the cost is smooth, this analysis is loose: plug-in estimators based on empirical measures converge quadratically faster, at the rate $n^{-2/d}$. As a corollary, we show that the Wasserstein distance between two distributions is significantly easier to estimate when the measures are far apart. We also prove lower bounds, showing not only that our analysis of the plug-in estimator is tight, but also that no other estimator can enjoy significantly faster rates of convergence uniformly over all pairs of measures. Our proofs rely on empirical process theory arguments based on tight control of $L^2$ covering numbers for locally Lipschitz and semi-concave functions. As a byproduct of our proofs, we derive $L^\infty$ estimates on the displacement induced by the optimal coupling between any two measures satisfying suitable moment conditions, for a wide range of cost functions.
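
To make the notion of a plug-in estimator concrete, the following minimal sketch computes the optimal transport cost between two empirical measures. It is an illustration under stated assumptions, not the procedure analyzed here: it assumes equal sample sizes, the squared Euclidean cost, and Gaussian test distributions, and the function name `plugin_ot_cost` is hypothetical. With $n$ uniformly weighted points on each side, the transport problem reduces to an assignment problem, which SciPy solves exactly.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist


def plugin_ot_cost(x, y):
    """Plug-in estimate of the squared-Euclidean OT cost.

    x, y: arrays of shape (n, d) holding samples from the two measures.
    Assumes equal sample sizes, so the optimal coupling between the two
    empirical measures can be taken to be a permutation.
    """
    cost = cdist(x, y, metric="sqeuclidean")  # n x n matrix of pairwise costs
    rows, cols = linear_sum_assignment(cost)  # minimum-cost perfect matching
    return cost[rows, cols].mean()            # average cost under the matching


# Illustrative setup (an assumption): N(0, I) versus N(shift, I) in dimension d.
rng = np.random.default_rng(0)
d, n = 5, 200
shift = np.ones(d)
x = rng.normal(size=(n, d))
y = rng.normal(size=(n, d)) + shift

estimate = plugin_ot_cost(x, y)
# For two Gaussians with the same covariance, the squared 2-Wasserstein
# distance equals the squared distance between their means.
population = float(shift @ shift)
print(f"plug-in estimate: {estimate:.3f}, population value: {population:.3f}")
```

Repeating the experiment over increasing `n` and averaging the absolute error over independent draws gives a rough empirical view of the convergence rate, although the exact assignment solver becomes slow for large samples.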