We consider the initial/boundary value problem for a diffusion equation involving multiple time-fractional derivatives on a bounded convex polyhedral domain. We analyze a space semidiscrete scheme based on the standard Galerkin finite element method using continuous piecewise linear functions. Nearly optimal error estimates for both cases of initial data and inhomogeneous term are derived, which cover both smooth and nonsmooth data. Further we develop a fully discrete scheme based on a finite difference discretization of the time-fractional derivatives, and discuss its stability and error estimate. Extensive numerical experiments for one and two-dimension problems confirm the convergence rates of the theoretical results.2 BANGTI JIN, RAYTCHO LAZAROV, YIKAN LIU, AND ZHI ZHOU space and slow asymptotic decay in time [30], which in turn also impacts related numerical analysis [12] and inverse problems [14,30].The model (1.1) was developed to improve the modeling accuracy of the single-term model (1.3) for describing anomalous diffusion. For example, in [31], a two-term fractionalorder diffusion model was proposed for the total concentration in solute transport, in order to distinguish explicitly the mobile and immobile status of the solute using fractional dynamics. The kinetic equation with two fractional derivatives of different orders appears also quite naturally when describing subdiffusive motion in velocity fields [26]; see also [16] for discussions on the model for wave-type phenomena.There are very few mathematical studies on the model (1.1). Luchko [23] established a maximum principle for problem (1.1), and constructed a generalized solution for the case f ≡ 0 using the multinomial Mittag-Leffler function. Jiang et al [9] derived analytical solutions for the diffusion equation with fractional derivatives in both time and space. Li and Yamamoto [20] established existence, uniqueness, and the Hölder regularity of the solution using a fixed point argument for problem (1.1) with variable coefficients {bi}. Very recently, Li et al [19] showed the uniqueness and continuous dependence of the solution on the initial value v and the source term f , by exploiting refined properties of the multinomial Mittag-Leffler function.The applications of the model (1.1) motivate the design and analysis of numerical schemes that have optimal (with respect to data regularity) convergence rates. Such schemes are especially valuable for problems where the solution has low regularity. The case m = 0, i.e., the single-term model (1.3), has been extensively studied, and stability and error estimates were provided; see [21,35] for the finite difference method, [18,34] for the spectral method, [25,27,28,12,11,10] for the finite element method, and [3,7] for meshfree methods based on radial basis functions, to name a few. In particular, in [10,11,12], the authors established almost optimal error estimates with respect to the regularity of the initial data v and the right hand side f for a semidiscrete Galerkin scheme. These studies includ...