We provide a general methodology for evaluating the optimal resource cost of error mitigation, employing methods developed in resource theories. We consider probabilistic error cancellation as the error mitigation technique and show that the optimal sampling cost realizable using the full expressibility of near-term devices is related to a resource quantifier defined in a framework in which noisy implementable operations are considered the free resource, allowing us to obtain universal bounds on this cost. As applications, we show that the cost for mitigating depolarizing noise presented in [Temme, Bravyi, and Gambetta, Phys. Rev. Lett. 119, 180509 (2017)] is optimal, extend the analysis to several other classes of noise models, and provide generic bounds applicable to general noise channels given in a certain form. Our results not only provide insights into the potential and limitations of feasible error mitigation on near-term devices but also display an application of resource theories as a useful theoretical toolkit.