We present an efficient algorithm for the evaluation of the Caputo fractional derivative ${}^{C}_{0}D^{\alpha}_{t} f(t)$ of order $\alpha \in (0, 1)$, which can be expressed as a convolution of $f'(t)$ with the kernel $t^{-\alpha}$. The algorithm is based on an efficient sum-of-exponentials approximation for the kernel $t^{-1-\alpha}$ on the interval $[\Delta t, T]$ with a uniform absolute error $\varepsilon$, where $N_{\text{exp}}$ denotes the number of exponentials needed. Compared with the direct method, the resulting algorithm reduces the storage requirement from $O(N_T)$ to $O(N_{\text{exp}})$ and the overall computational cost from $O(N_T^2)$ to $O(N_T N_{\text{exp}})$, where $N_T$ is the total number of time steps. Furthermore, when the fast evaluation scheme for the Caputo derivative is applied to solve fractional diffusion equations, the resulting algorithm requires only $O(N_S N_{\text{exp}})$ storage and $O(N_S N_T N_{\text{exp}})$ work, where $N_S$ is the total number of points in space, whereas the direct methods require $O(N_S N_T)$ storage and $O(N_S N_T^2)$ work. The complexity of both algorithms is nearly optimal, since $N_{\text{exp}}$ is of order $O(\log N_T)$ for $T \gg 1$ or $O(\log^2 N_T)$ for $T \approx 1$ at fixed accuracy $\varepsilon$. We also present a detailed stability and error analysis of the new scheme for solving linear fractional diffusion equations. The performance of the new algorithm is illustrated via several numerical examples. Finally, the algorithm can be parallelized in a straightforward manner.
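To make the idea concrete, the following is a minimal Python sketch of how a sum-of-exponentials kernel leads to $O(N_{\text{exp}})$ storage and $O(N_T N_{\text{exp}})$ work. It is an illustration under stated assumptions, not the construction analyzed in this paper: the nodes and weights come from a plain trapezoidal rule applied to the integral representation $t^{-\beta} = \frac{1}{\Gamma(\beta)} \int_{-\infty}^{\infty} e^{-t e^{x} + \beta x}\, dx$, the truncation limits are heuristic and target a relative rather than an absolute error, and the function names `soe_power_kernel` and `fast_caputo` as well as the parameters `tol` and `h` are hypothetical. The history part of the convolution is integrated by parts so that only the kernel $t^{-1-\alpha}$ on $[\Delta t, T]$ appears, $f$ is interpolated linearly on each step, and each exponential mode is advanced by a one-step recurrence.

```python
import math

def soe_power_kernel(beta, delta, T, tol=1e-9, h=0.3):
    """Nodes s_j and weights w_j such that, for delta <= t <= T,
    t**(-beta) ~= sum_j w_j * exp(-s_j * t).
    Trapezoidal rule in x (with s = e**x); cutoffs are heuristic assumptions."""
    x_lo = -math.log(T) + math.log(tol) / beta - 1.0  # small-s cutoff
    x_hi = math.log(-2.0 * math.log(tol) / delta)     # large-s cutoff
    n = int(math.ceil((x_hi - x_lo) / h)) + 1
    xs = [x_lo + j * h for j in range(n)]
    s = [math.exp(x) for x in xs]
    w = [h * math.exp(beta * x) / math.gamma(beta) for x in xs]
    return s, w

def fast_caputo(f_vals, dt, alpha, tol=1e-9):
    """Approximate the Caputo derivative of order alpha at t_n = n*dt from
    samples f_vals[n] = f(t_n), n = 0..N, keeping one history value per node."""
    N = len(f_vals) - 1
    s, w = soe_power_kernel(1.0 + alpha, dt, N * dt, tol)
    decay, a, b = [], [], []
    for sj in s:
        x = sj * dt
        phi1 = -math.expm1(-x) / x  # (1 - exp(-x)) / x
        # phi2 = (exp(-x) - 1 + x) / x**2; Taylor series avoids cancellation
        if x < 1e-3:
            phi2 = 0.5 - x / 6.0 + x * x / 24.0
        else:
            phi2 = (math.expm1(-x) + x) / (x * x)
        d = math.exp(-x)
        decay.append(d)
        a.append(d * dt * phi2)            # weight on f(t_{n-1})
        b.append(d * dt * (phi1 - phi2))   # weight on f(t_{n-2})
    U = [0.0] * len(s)      # U[j] ~ int_0^{t_{n-1}} exp(-s_j (t_n - u)) f(u) du
    out = [0.0] * (N + 1)   # out[0] stays 0 (value at t = 0 for smooth f)
    c_local = dt ** alpha * math.gamma(2.0 - alpha)
    for n in range(1, N + 1):
        if n >= 2:
            # one-step recurrence: damp the old history, add the newest interval
            U = [decay[j] * U[j] + a[j] * f_vals[n - 1] + b[j] * f_vals[n - 2]
                 for j in range(len(s))]
        history = sum(wj * uj for wj, uj in zip(w, U))
        local = (f_vals[n] - f_vals[n - 1]) / c_local  # last step, linear f
        far = (f_vals[n - 1] / dt ** alpha
               - f_vals[0] / (n * dt) ** alpha
               - alpha * history) / math.gamma(1.0 - alpha)
        out[n] = local + far
    return out

if __name__ == "__main__":
    dt, N, alpha = 0.01, 100, 0.5
    approx = fast_caputo([(n * dt) ** 2 for n in range(N + 1)], dt, alpha)
    exact = 2.0 * (N * dt) ** (2.0 - alpha) / math.gamma(3.0 - alpha)
    print(approx[-1], exact)
```

The demo at the bottom compares the result for $f(t) = t^2$ with the closed form $2\,t^{2-\alpha}/\Gamma(3-\alpha)$. The loop stores only the vector `U`, one entry per exponential, rather than the full history of $f$, which is the source of the storage reduction described above. The trapezoidal nodes used here are simple to derive but are not tuned to minimize $N_{\text{exp}}$.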